  1. Feb 02, 2016
    • MIPS: traps.c: Don't emulate RDHWR in the CpU #0 exception handler · 10f6d99f
      Maciej W. Rozycki authored
      
      
      In the regular MIPS instruction set RDHWR is encoded with the SPECIAL3
      (011111) major opcode.  Therefore it cannot trigger the CpU (Coprocessor
      Unusable) exception, and certainly not for coprocessor 0, as the opcode
      does not overlap with any of the older ISA reservations, i.e. LWC0
      (110000), SWC0 (111000), LDC0 (110100) or SDC0 (111100).  The closest
      match might be SDC3 (111111), possibly causing a CpU #3 exception;
      however, our code does not handle that anyway.  A quick check with a MIPS I
      and a MIPS III processor:
      
      CPU0 revision is: 00000220 (R3000)
      CPU0 revision is: 00000440 (R4400SC)
      
      indeed indicates that the RI (Reserved Instruction) exception is
      triggered.  It's only LL and SC that require emulation in the CpU #0
      exception handler as they reuse the LWC0 and SWC0 opcodes respectively.
      
      In the microMIPS instruction set RDHWR is mandatory and triggering the
      RI exception is required on unimplemented or disabled register accesses.
      Therefore emulating the microMIPS instruction in the CpU #0 exception
      handler is not required either.
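
      For reference, a rough sketch of the major-opcode values discussed above
      (constants named here for illustration only, not taken from the kernel
      sources):

        /* The major opcode field is bits 31:26 of the instruction word. */
        #define MAJOR_OPC(insn)   (((insn) >> 26) & 0x3f)

        #define OPC_SPECIAL3      0x1f  /* 011111: RDHWR lives here; traps as RI    */
        #define OPC_LWC0          0x30  /* 110000: reused by LL, emulated on CpU #0 */
        #define OPC_LDC0          0x34  /* 110100                                   */
        #define OPC_SWC0          0x38  /* 111000: reused by SC, emulated on CpU #0 */
        #define OPC_SDC0          0x3c  /* 111100                                   */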
      
      Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/12280/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
  2. Feb 01, 2016
    • MIPS: Fix FPU disable with preemption · 00fe56dc
      James Hogan authored
      
      
      The FPU should not be left enabled after a task context switch. This
      isn't usually a problem as the FPU enable bit is updated before
      returning to userland, however it can potentially mask kernel bugs, and
      in fact KVM assumes it won't happen and won't clear the FPU enable bit
      before returning to the guest, which allows the guest to use stale FPU
      context.
      
      Interrupts and exceptions save and restore most bits of the CP0 Status
      register which contains the FPU enable bit (CU1). When the kernel needs
      to enable or disable the FPU (for example due to attempted FPU use by
      userland, or the scheduler being invoked) both the actual Status
      register and the saved value in the userland context are updated.
      
      However this doesn't work correctly with full kernel preemption enabled,
      since the FPU enable bit can be cleared from within an interrupt when
      the scheduler is invoked, and only the userland context is updated, not
      the interrupt context.
      
      For example:
      1) Enter kernel with FPU already enabled, TIF_USEDFPU=1, Status.CU1=1
         saved.
      2) Take a timer interrupt while in kernel mode, Status.CU1=1 saved.
      3) Timer interrupt invokes scheduler to preempt the task, which clears
         TIF_USEDFPU and disables the FPU (Status.CU1=0) both in the Status
         register and in the value stored in the user context from step (1),
         but not in the interrupt context from step (2).
      4) When the process is scheduled back in again Status.CU1=0.
      5) The interrupt context from step (2) is restored, which sets
         Status.CU1=1. So from user context point of view, preemption has
         re-enabled FPU!
      6) If the scheduler is invoked again (via preemption or voluntarily)
         before returning to userland, TIF_USEDFPU=0 so the FPU is not
         disabled before the task context switch.
      7) The next task resumes from the context switch with FPU enabled!
      
      The restoring of the Status register on return from interrupt/exception
      is already selective about which bits to restore, leaving the interrupt
      mask bits alone so enabling/disabling of CPU interrupt lines can
      persist. Extend this to also leave both the CU1 bit (FPU enable) and the
      FR bit (which specifies the FPU mode and gets changed with CU1) untouched. This
      prevents a stale Status value being restored in step (5) above and
      persisting through subsequent context switches.
      
      Also switch to the use of definitions from asm/mipsregs.h while we're at
      it.
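
      As a rough C-level illustration of the idea (the real code is in the
      exception-return assembly; saved_status stands for the Status word saved
      at exception entry, and the ST0_* masks are the asm/mipsregs.h
      definitions mentioned above):

        /* Keep the live interrupt mask, CU1 and FR bits; take everything
         * else from the Status value saved at exception entry. */
        unsigned long keep = ST0_IM | ST0_CU1 | ST0_FR;

        write_c0_status((read_c0_status() & keep) | (saved_status & ~keep));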
      
      Since this change also affects the restoration of the Status register on
      the path back to userland, it increases the kernel's sensitivity to the
      problem of the FPU being left enabled, allowing it to propagate to
      userland. A warning is therefore also added to lose_fpu_inatomic() to
      point out any future recurrences before they do any damage.
      
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Reviewed-by: Paul Burton <paul.burton@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/12303/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • MIPS: Properly disable FPU in start_thread() · 76e5846d
      James Hogan authored
      
      
      start_thread() (called for execve(2)) clears the TIF_USEDFPU flag
      without atomically disabling the FPU. With a preemptive kernel, an
      unfortunately timed preemption after this could result in another
      task (or KVM guest) being scheduled in with the FPU still enabled, since
      lose_fpu_inatomic() only turns it off if TIF_USEDFPU is set.
      
      Use lose_fpu(0) instead of the separate FPU / MSA management, which
      should do the right thing (drop FPU properly and atomically without
      saving state) and will be more future proof.
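
      A minimal sketch of the resulting shape of start_thread() (heavily
      simplified; only the FPU-related step is shown):

        void start_thread(struct pt_regs *regs, unsigned long pc, unsigned long sp)
        {
                /* ... reset Status bits, clear thread flags, etc. ... */

                lose_fpu(0);    /* atomically disable FPU/MSA and clear
                                 * TIF_USEDFPU without saving the state */

                /* ... set up regs->cp0_epc = pc, regs->regs[29] = sp ... */
        }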
      
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Reviewed-by: Paul Burton <paul.burton@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/12302/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • MIPS: Fix buffer overflow in syscall_get_arguments() · f4dce1ff
      James Hogan authored
      
      
      Since commit 4c21b8fd ("MIPS: seccomp: Handle indirect system calls
      (o32)"), syscall_get_arguments() attempts to handle o32 indirect syscall
      arguments by incrementing both the start argument number and the number
      of arguments to fetch. However, only the start argument number needs to
      be incremented; the number of arguments does not change, they're just
      shifted up by one, and in fact the output array is provided by the
      caller and is likely only n entries long, so reading more arguments
      overflows the output buffer.
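
      A sketch of the distinction (simplified; the fetch loop follows the
      arch/mips/include/asm/syscall.h helper of the time, and the
      indirect-syscall condition is paraphrased):

        if (o32_indirect_syscall) {     /* paraphrased condition */
                /* Shift the starting argument only; incrementing n as well
                 * reads one slot too many and overruns the caller's args[]. */
                i++;
        }

        while (n--)
                ret |= mips_get_syscall_arg(args++, task, regs, i++);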
      
      In the case of seccomp, this results in it fetching 7 arguments starting
      at the 2nd one, which overflows the unsigned long args[6] in
      populate_seccomp_data(). This clobbers the $s0 register from
      syscall_trace_enter() which __seccomp_phase1_filter() saved onto the
      stack, into which syscall_trace_enter() had placed its syscall number
      argument. This caused Chromium to crash.
      
      Credit goes to Milko for tracking it down as far as $s0 being clobbered.
      
      Fixes: 4c21b8fd ("MIPS: seccomp: Handle indirect system calls (o32)")
      Reported-by: Milko Leporis <milko.leporis@imgtec.com>
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 3.15-
      Patchwork: https://patchwork.linux-mips.org/patch/12213/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
  3. Jan 29, 2016
    • x86/mm/pat: Avoid truncation when converting cpa->numpages to address · 74256377
      Matt Fleming authored
      
      
      There are a couple of nasty truncation bugs lurking in the pageattr
      code that can be triggered when mapping EFI regions, e.g. when we pass
      a cpa->pgd pointer. Because cpa->numpages is a 32-bit value, shifting
      left by PAGE_SHIFT will truncate the resultant address to 32-bits.
      
      Viorel-Cătălin managed to trigger this bug on his Dell machine that
      provides a ~5GB EFI region which requires 1236992 pages to be mapped.
      When calling populate_pud() the end of the region gets calculated
      incorrectly in the following buggy expression,
      
        end = start + (cpa->numpages << PAGE_SHIFT);
      
      And only 188416 pages are mapped. Next, populate_pud() gets invoked
      for a second time because of the loop in __change_page_attr_set_clr(),
      only this time no pages get mapped because shifting the remaining
      number of pages (1048576) by PAGE_SHIFT is zero, at which point the
      loop in __change_page_attr_set_clr() spins forever because we fail to
      make progress.
      
      Hitting this bug depends very much on the virtual address we pick to
      map the large region at and how many pages we map on the initial run
      through the loop. This explains why this issue was only recently hit
      with the introduction of commit
      
        a5caa209 ("x86/efi: Fix boot crash by mapping EFI memmap
         entries bottom-up at runtime, instead of top-down")
      
      It's interesting to note that safe uses of cpa->numpages do exist in
      the pageattr code. If instead of shifting ->numpages we multiply by
      PAGE_SIZE, no truncation occurs because PAGE_SIZE is a UL value, and
      so the result is unsigned long.
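
      A small illustration of the promotion rules at play (assuming the usual
      PAGE_SHIFT = 12 and PAGE_SIZE = 4096UL):

        int numpages = 1236992;                   /* ~5 GiB worth of 4 KiB pages */

        /* int << int stays int: the product needs 33 bits, so the high bit
         * is lost before the assignment widens the value. */
        unsigned long bad  = numpages << PAGE_SHIFT;

        /* int * unsigned long promotes numpages to unsigned long first,
         * so the full result survives. */
        unsigned long good = numpages * PAGE_SIZE;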
      
      To avoid surprises when users try to convert very large cpa->numpages
      values to addresses, change the data type from 'int' to 'unsigned
      long', thereby making it suitable for shifting by PAGE_SHIFT without
      any type casting.
      
      The alternative would be to make liberal use of casting, but that is
      far more likely to cause problems in the future when someone adds more
      code and fails to cast properly; this bug was difficult enough to
      track down in the first place.
      
      Reported-and-tested-by: Viorel-Cătălin Răpițeanu <rapiteanu.catalin@gmail.com>
      Acked-by: Borislav Petkov <bp@alien8.de>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=110131
      Link: http://lkml.kernel.org/r/1454067370-10374-1-git-send-email-matt@codeblueprint.co.uk
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • perf/x86: De-obfuscate code · 8f04b853
      Peter Zijlstra authored
      
      
      Get rid of the 'onln' obfuscation.
      
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86: Fix uninitialized value usage · e01d8718
      Peter Zijlstra authored
      
      
      When calling intel_alt_er() with .idx != EXTRA_REG_RSP_* we will not
      initialize alt_idx and then use this uninitialized value to index an
      array.
      
      While not necessarily fatal in itself, it can result in an infinite loop
      in its caller __intel_shared_reg_get_constraints(), with IRQs disabled.
      
      Alternative error modes are random memory corruption due to the
      cpuc->shared_regs->regs[] array overrun, which manifests as either
      get_constraints or put_constraints doing weird stuff.
      
      Only took 6 hours of painful debugging to find this. Neither GCC nor
      Smatch warnings flagged this bug.
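
      The fix presumably boils down to giving the index a sane initial value,
      roughly along these lines (an illustrative sketch of intel_alt_er(), not
      the exact diff):

        static int intel_alt_er(int idx, u64 config)
        {
                int alt_idx = idx;      /* initialize: fall back to the original
                                         * index instead of indexing the
                                         * extra_regs array with garbage */

                if (idx == EXTRA_REG_RSP_0)
                        alt_idx = EXTRA_REG_RSP_1;
                if (idx == EXTRA_REG_RSP_1)
                        alt_idx = EXTRA_REG_RSP_0;

                /* ... validity check of config against extra_regs[alt_idx] ... */

                return alt_idx;
        }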
      
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: ae3f011f ("perf/x86/intel: Fix SLM MSR_OFFCORE_RSP1 valid_mask")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. Jan 28, 2016
    • powerpc/mm: Fixup _HPAGE_CHG_MASK · 2d19fc63
      Aneesh Kumar K.V authored
      
      
      This was wrongly updated by commit 7aa9a23c ("powerpc, thp: remove
      infrastructure for handling splitting PMDs") during the last merge
      window. Fix it up.
      
      This could lead to incorrect behaviour in THP and/or mprotect(), at a
      minimum.
      
      Fixes: 7aa9a23c ("powerpc, thp: remove infrastructure for handling splitting PMDs")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/perf: Remove PPMU_HAS_SSLOT flag for Power8 · 370f06c8
      Madhavan Srinivasan authored
      
      
      Commit 7a786832 ("powerpc/perf: Add an explict flag indicating
      presence of SLOT field") introduced the PPMU_HAS_SSLOT flag to remove
      the assumption that MMCRA[SLOT] was present when PPMU_ALT_SIPR was not
      set.
      
      That commit's changelog also mentions that Power8 does not support
      MMCRA[SLOT]. However, when the Power8 PMU support was merged, it
      erroneously included the PPMU_HAS_SSLOT flag.
      
      So remove PPMU_HAS_SSLOT from the Power8 flags.
      
      mpe: On systems where MMCRA[SLOT] exists, the field occupies bits 37:39
      (IBM numbering). On Power8 bit 37 is reserved, and 38:39 overlap with
      the high bits of the Threshold Event Counter Mantissa. I am not aware of
      any published events which use the threshold counting mechanism, which
      would cause the mantissa bits to be set. So in practice this bug is
      unlikely to trigger.
      
      Fixes: e05b9b9e ("powerpc/perf: Power8 PMU support")
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  5. Jan 25, 2016
    • arm64: Fix an enum typo in mm/dump.c · b3122023
      Masanari Iida authored
      
      
      This patch fixes a typo in mm/dump.c:
      "MODUELS_END_NR" should be "MODULES_END_NR".
      
      Signed-off-by: Masanari Iida <standby24x7@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Honour !PTE_WRITE in set_pte_at() for kernel mappings · ac15bd63
      Catalin Marinas authored
      
      
      Currently, set_pte_at() only checks the software PTE_WRITE bit for user
      mappings when it sets or clears the hardware PTE_RDONLY accordingly. The
      kernel ptes are written directly without any modification, relying
      solely on the protection bits in macros like PAGE_KERNEL. However,
      modifying kernel pte attributes via pte_wrprotect() would be ignored by
      set_pte_at(). Since pte_wrprotect() does not set PTE_RDONLY (it only
      clears PTE_WRITE), the new permission is not taken into account.
      
      This patch changes set_pte_at() to adjust the read-only permission for
      kernel ptes as well. As a side effect, existing PROT_* definitions used
      for kernel ioremap*() need to include PTE_DIRTY | PTE_WRITE.
      
      (additionally, white space fix for PTE_KERNEL_ROX)
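
      The idea, roughly, is to derive the hardware PTE_RDONLY bit from the
      software PTE_WRITE (and dirty) state for every valid pte rather than
      only for user ones; a sketch, not the exact hunk:

        if (pte_valid(pte)) {
                if (pte_write(pte) && pte_dirty(pte))
                        pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY));
                else
                        pte = set_pte_bit(pte, __pgprot(PTE_RDONLY));
        }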
      
      Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: kernel: fix architected PMU registers unconditional access · f436b2ac
      Lorenzo Pieralisi authored
      
      
      The Performance Monitors extension is an optional feature of the
      AArch64 architecture, therefore, in order to access Performance
      Monitors registers safely, the kernel should detect the architected
      PMU unit presence through the ID_AA64DFR0_EL1 register PMUVer field
      before accessing them.
      
      This patch implements a guard by reading the ID_AA64DFR0_EL1 register
      PMUVer field to detect the architected PMU presence and prevent accessing
      PMU system registers if the Performance Monitors extension is not
      implemented in the core.
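
      The actual guard lives in the early assembly setup code; expressed in
      rough C it amounts to something like the following (field position per
      the ARM ARM, ID_AA64DFR0_EL1.PMUVer occupying bits [11:8]):

        u64 dfr0 = read_cpuid(ID_AA64DFR0_EL1);

        if (((dfr0 >> 8) & 0xf) != 0) {
                /* An architected PMU is implemented: safe to reset
                 * PMUSERENR_EL0 (and the other PMU registers). */
                asm volatile("msr pmuserenr_el0, %0" :: "r" (0UL));
        }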
      
      Cc: Peter Maydell <peter.maydell@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: <stable@vger.kernel.org>
      Fixes: 60792ad3 ("arm64: kernel: enforce pmuserenr_el0 initialization and restore")
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: kasan: ensure that the KASAN zero page is mapped read-only · 7b1af979
      Ard Biesheuvel authored
      
      
      When switching from the early KASAN shadow region, which maps the
      entire shadow space read-write, to the permanent KASAN shadow region,
      which uses a zero page to shadow regions that are not subject to
      instrumentation, the lowest level table kasan_zero_pte[] may be
      reused unmodified, which means that the mappings of the zero page
      that it contains will still be read-write.
      
      So update it explicitly to map the zero page read only when we
      activate the permanent mapping.
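
      Presumably something along these lines in the arm64 KASAN init code (a
      sketch; the kasan_zero_page name alongside kasan_zero_pte[] is assumed):

        int i;

        /* Remap the zero shadow page read-only once the permanent shadow
         * tables are in place. */
        for (i = 0; i < PTRS_PER_PTE; i++)
                set_pte(&kasan_zero_pte[i],
                        pfn_pte(virt_to_pfn(kasan_zero_page), PAGE_KERNEL_RO));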
      
      Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: hide __efistub_ aliases from kallsyms · 75feee3d
      Ard Biesheuvel authored
      
      
      Commit e8f3010f ("arm64/efi: isolate EFI stub from the kernel
      proper") isolated the EFI stub code from the kernel proper by prefixing
      all of its symbols with __efistub_, and selectively allowing access to
      core kernel symbols from the stub by emitting __efistub_ aliases for
      functions and variables that the stub can access legally.
      
      As an unintended side effect, these aliases are emitted into the
      kallsyms symbol table, which means they may turn up in backtraces,
      e.g.,
      
        ...
        PC is at __efistub_memset+0x108/0x200
        LR is at fixup_init+0x3c/0x48
        ...
        [<ffffff8008328608>] __efistub_memset+0x108/0x200
        [<ffffff8008094dcc>] free_initmem+0x2c/0x40
        [<ffffff8008645198>] kernel_init+0x20/0xe0
        [<ffffff8008085cd0>] ret_from_fork+0x10/0x40
      
      The backtrace in question has nothing to do with the EFI stub, but
      simply returns one of the several aliases of memset() that have been
      recorded in the kallsyms table. This is undesirable, since it may
      suggest to people who are not aware of this that the issue they are
      seeing is somehow EFI related.
      
      So hide the __efistub_ aliases from kallsyms by explicitly emitting them
      as absolute linker symbols. The distinction between absolute and
      section-relative symbols is completely irrelevant to these definitions
      and to the final link in which they are taken into account (it only
      matters for symbols defined inside a section definition when performing
      a partial link), so the resulting values are identical to the original
      ones. Since kallsyms ignores absolute symbols, these aliases will be
      omitted from its symbol table.
      
      After this patch, the backtrace generated from the same address looks
      like this:
        ...
        PC is at __memset+0x108/0x200
        LR is at fixup_init+0x3c/0x48
        ...
        [<ffffff8008328608>] __memset+0x108/0x200
        [<ffffff8008094dcc>] free_initmem+0x2c/0x40
        [<ffffff8008645198>] kernel_init+0x20/0xe0
        [<ffffff8008085cd0>] ret_from_fork+0x10/0x40
      
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • powerpc/mm: Allow user space to map rtas_rmo_buf · e256caa7
      Vasant Hegde authored
      
      
      Since commit 90a545e9 ("restrict /dev/mem to idle io memory ranges"),
      mapping rtas_rmo_buf from user space fails, so we are no longer able to
      make the RTAS syscall.
      
      This patch calls page_is_rtas_user_buf() before calling
      iomem_is_exclusive() in devmem_is_allowed(). This allows user space to
      map rtas_rmo_buf again, so the RTAS syscall works.
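
      A sketch of the resulting check order in devmem_is_allowed() (simplified;
      the surrounding logic is elided):

        int devmem_is_allowed(unsigned long pfn)
        {
                if (page_is_rtas_user_buf(pfn))
                        return 1;       /* allow the RTAS buffer first */
                if (iomem_is_exclusive(PFN_PHYS(pfn)))
                        return 0;
                if (!page_is_ram(pfn))
                        return 1;
                return 0;
        }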
      
      Reported-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
      CC: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>