Commits · 70ad237515d99595ed03848bd8e549e50e83c4f2 · jan.koester / Linux

Sep 25, 2014

powerpc: Fix warning reported by verify_cpu_node_mapping() · 70ad2375

Li Zhong authored Aug 27, 2014



With commit 2fabf084 ("powerpc: reorder per-cpu NUMA information's
initialization"), during boottime, cpu_numa_callback() is called
earlier (before they are online) for each cpu, and verify_cpu_node_mapping()
uses cpu_to_node() to check whether siblings are in the same node.

It skips the checking for siblings that are not online yet. So the only
check done here is for the bootcpu, which is online at that time. But
the per-cpu numa_node cpu_to_node() uses hasn't been set up yet (which
will be set up in smp_prepare_cpus()).

So I saw something like the following reported:
[    0.000000] CPU thread siblings 1/2/3 and 0 don't belong to the same
node!

As we don't actually do the checking during this early stage, maybe
we could directly call numa_setup_cpu() in do_init_bootmem().

Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Acked-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

70ad2375

powerpc: Implement emulation of string loads and stores · c9f6f4ed

Paul Mackerras authored Sep 02, 2014



The size field of the op.type word is now the total number of bytes
to be loaded or stored.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

c9f6f4ed

powerpc: Emulate icbi, mcrf and conditional-trap instructions · cf87c3f6

Paul Mackerras authored Sep 02, 2014



This extends the instruction emulation done by analyse_instr() and
emulate_step() to handle a few more instructions that are found in
the kernel.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cf87c3f6

powerpc: Split out instruction analysis part of emulate_step() · be96f633

Paul Mackerras authored Sep 02, 2014



This splits out the instruction analysis part of emulate_step() into
a separate analyse_instr() function, which decodes the instruction,
but doesn't execute any load or store instructions.  It does execute
integer instructions and branches which can be executed purely by
updating register values in the pt_regs struct.  For other instructions,
it returns the instruction type and other details in a new
instruction_op struct.  emulate_step() then uses that information
to execute loads, stores, cache operations, mfmsr, mtmsr[d], and
(on 64-bit) sc instructions.

The reason for doing this is so that the KVM code can use it instead
of having its own separate instruction emulation code.  Possibly the
alignment interrupt handler could also use this.
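
The decode/execute split described above can be illustrated with a toy model. This is only a sketch under stated assumptions: toy_regs, toy_op, toy_analyse() and toy_emulate() are hypothetical stand-ins for pt_regs, instruction_op, analyse_instr() and emulate_step(), memory is a plain big-endian byte array, and only addi and lwz are handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for pt_regs. */
struct toy_regs {
	uint64_t gpr[32];
};

enum toy_op_type { TOY_DONE, TOY_LOAD, TOY_UNKNOWN };

/* Hypothetical stand-in for instruction_op. */
struct toy_op {
	enum toy_op_type type;
	unsigned int reg;	/* destination register of a load */
	uint64_t ea;		/* effective address */
	size_t size;		/* access size in bytes */
};

/*
 * Decode one instruction. Integer instructions are completed here by
 * updating registers only; loads are merely described in *op.
 */
static enum toy_op_type toy_analyse(struct toy_op *op, struct toy_regs *regs,
				    uint32_t instr)
{
	unsigned int opcode = instr >> 26;
	unsigned int rt = (instr >> 21) & 0x1f;
	unsigned int ra = (instr >> 16) & 0x1f;
	int64_t imm = (int16_t)(instr & 0xffff);	/* sign-extended */
	uint64_t base = ra ? regs->gpr[ra] : 0;		/* RA = 0 means 0 */

	switch (opcode) {
	case 14:	/* addi rt, ra, imm */
		regs->gpr[rt] = base + (uint64_t)imm;
		op->type = TOY_DONE;
		break;
	case 32:	/* lwz rt, imm(ra) */
		op->type = TOY_LOAD;
		op->reg = rt;
		op->ea = base + (uint64_t)imm;
		op->size = 4;
		break;
	default:
		op->type = TOY_UNKNOWN;
		break;
	}
	return op->type;
}

/* Returns 1 if the instruction was emulated, 0 if it was not. */
static int toy_emulate(struct toy_regs *regs, const uint8_t *mem,
		       size_t mem_len, uint32_t instr)
{
	struct toy_op op;

	switch (toy_analyse(&op, regs, instr)) {
	case TOY_DONE:
		return 1;
	case TOY_LOAD:
		/* The memory access happens only in the execute step. */
		if (op.ea > mem_len || mem_len - op.ea < op.size)
			return 0;
		regs->gpr[op.reg] = ((uint64_t)mem[op.ea] << 24) |
				    ((uint64_t)mem[op.ea + 1] << 16) |
				    ((uint64_t)mem[op.ea + 2] << 8) |
				    (uint64_t)mem[op.ea + 3];
		return 1;
	default:
		return 0;
	}
}
```

A second caller, such as a hypervisor, could call toy_analyse() alone and perform the described memory access through its own accessors.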

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

be96f633

powerpc: Check flat device tree version at boot · ad72a279

Michael Ellerman authored Aug 28, 2014



In commit e6a6928c "of/fdt: Convert FDT functions to use libfdt",
the kernel stopped supporting old flat device tree formats. The minimum
supported version is now 0x10.

There was a checking function added, early_init_dt_verify(), but it's
not called on powerpc.

The result is, if you boot with an old flat device tree, the kernel will
fail to parse it correctly, think you have no memory etc. and hilarity
ensues.

We can't really fix it, but we can at least catch the fact that the
device tree is in an unsupported format and panic(). We can't call
BUG(), it's too early.
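
The kind of check involved can be sketched in plain C. This is not the kernel's early_init_dt_verify(); it is a minimal illustration that assumes the standard flattened device tree header layout (big-endian magic at byte offset 0, version at byte offset 20). The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC	0xd00dfeedu
#define FDT_MIN_VERSION	0x10u	/* minimum version stated above */
#define FDT_OFF_VERSION	20	/* byte offset of the version field */

/* Read a big-endian 32-bit value regardless of host endianness. */
static uint32_t fdt_be32_load(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 1 if the blob looks like a supported FDT, 0 otherwise. */
static int fdt_version_supported(const uint8_t *blob, size_t len)
{
	if (len < FDT_OFF_VERSION + 4)
		return 0;
	if (fdt_be32_load(blob) != FDT_MAGIC)
		return 0;
	return fdt_be32_load(blob + FDT_OFF_VERSION) >= FDT_MIN_VERSION;
}
```

Early boot code would call panic() when such a check returns 0.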

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

ad72a279

powerpc/powernv: Don't call generic code on offline cpus · d6a4f709

Paul Mackerras authored Sep 02, 2014

On PowerNV platforms, when a CPU is offline, we put it into nap mode.
It's possible that the CPU wakes up from nap mode while it is still
offline due to a stray IPI. A misdirected device interrupt could also
potentially cause it to wake up. In that circumstance, we need to clear
the interrupt so that the CPU can go back to nap mode.

In the past the clearing of the interrupt was accomplished by briefly
enabling interrupts and allowing the normal interrupt handling code
(do_IRQ() etc.) to handle the interrupt. This has the problem that
this code calls irq_enter() and irq_exit(), which call functions such
as account_system_vtime() which use RCU internally. Use of RCU is not
permitted on offline CPUs and will trigger errors if RCU checking is
enabled.

To avoid calling into any generic code which might use RCU, we adopt
a different method of clearing interrupts on offline CPUs. Since we
are on the PowerNV platform, we know that the system interrupt
controller is a XICS being driven directly (i.e. not via hcalls) by
the kernel. Hence this adds a new icp_native_flush_interrupt()
function to the native-mode XICS driver and arranges to call that
when an offline CPU is woken from nap. This new function reads the
interrupt from the XICS. If it is an IPI, it clears the IPI; if it
is a device interrupt, it prints a warning and disables the source.
Then it does the end-of-interrupt processing for the interrupt.

The other thing that briefly enabling interrupts did was to check and
clear the irq_happened flag in this CPU's PACA. Therefore, after
flushing the interrupt from the XICS, we also clear all bits except
the PACA_IRQ_HARD_DIS (interrupts are hard disabled) bit from the
irq_happened flag. The PACA_IRQ_HARD_DIS flag is set by power7_nap()
and is left set to indicate that interrupts are hard disabled. This
means we then have to ignore that flag in power7_nap(), which is
reasonable since it doesn't indicate that any interrupt event needs
servicing.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

d6a4f709

powerpc: Use CONFIG_ARCH_HAS_FAST_MULTIPLIER · 423216ed

Anton Blanchard authored Sep 16, 2014



I ran some tests to compare hash_64 using shifts and multiplies.
The results:

POWER6: ~2x slower
POWER7: ~2x faster
POWER8: ~2x faster

Now we have a proper config option, select
CONFIG_ARCH_HAS_FAST_MULTIPLIER on POWER7 and POWER8.
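
For illustration, the two variants being compared look roughly like this. It is a sketch, not the kernel source: the constant is the 64-bit golden-ratio prime as I recall it from the hash code of that era (treat the value as an assumption), the function names are hypothetical, and the final shift that reduces the result to the requested number of bits is omitted.

```c
#include <stdint.h>

/* 2^63 + 2^61 - 2^57 + 2^54 - 2^51 - 2^18 + 1 (assumed constant) */
#define GOLDEN_RATIO_PRIME_64 0x9e37fffffffc0001ULL

/* Variant for CPUs with a fast multiplier: one multiply. */
static uint64_t hash64_multiply(uint64_t val)
{
	return val * GOLDEN_RATIO_PRIME_64;
}

/* Variant for CPUs with a slow multiplier: the same product, built
 * from shifts and adds, one step per power of two in the constant. */
static uint64_t hash64_shift_add(uint64_t val)
{
	uint64_t hash = val;	/* the "+ 1" term */
	uint64_t n = val;

	n <<= 18; hash -= n;	/* - 2^18 */
	n <<= 33; hash -= n;	/* - 2^51 */
	n <<= 3;  hash += n;	/* + 2^54 */
	n <<= 3;  hash -= n;	/* - 2^57 */
	n <<= 4;  hash += n;	/* + 2^61 */
	n <<= 2;  hash += n;	/* + 2^63 */
	return hash;
}
```

Both functions return the same value for every input (arithmetic is modulo 2^64), so the config option only selects which one is cheaper on a given CPU.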

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

423216ed

powerpc: Add POWER8 CPU selection · ff2e466a

Anton Blanchard authored Sep 16, 2014



This allows the user to build a kernel targeted at POWER8
(i.e. gcc -mcpu=power8).

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

ff2e466a

pseries: Fix endian issues in cpu hot-removal · e36d1227

Thomas Falcon authored Sep 12, 2014



When removing a cpu, this patch makes sure that values
gotten from or passed to firmware are in the correct
endian format.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

e36d1227

pseries: Fix endian issues in onlining cpu threads · 822e7122

Thomas Falcon authored Sep 12, 2014



The ibm,ppc-interrupt-server#s property is in big endian format.
These values need to be converted when used by little endian
architectures.
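
A minimal userspace sketch of such a conversion follows. In the kernel this is what be32_to_cpu() is for; the helper names and the buffer-based interface below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one big-endian 32-bit cell to host order, byte by byte, so
 * the result is the same on little- and big-endian hosts. */
static uint32_t be32_cell_to_cpu(const uint8_t *cell)
{
	return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
	       ((uint32_t)cell[2] << 8) | (uint32_t)cell[3];
}

/* Decode up to max cells of a raw property into out[]; returns the
 * number of cells decoded. Trailing partial cells are ignored. */
static size_t decode_be32_cells(const uint8_t *prop, size_t prop_len,
				uint32_t *out, size_t max)
{
	size_t n = prop_len / 4;
	size_t i;

	if (n > max)
		n = max;
	for (i = 0; i < n; i++)
		out[i] = be32_cell_to_cpu(prop + 4 * i);
	return n;
}
```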

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

822e7122

powerpc: Simplify symbol check in prom_init_check.sh · fe921c8c

Andreas Schwab authored Sep 13, 2014



Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

fe921c8c

powerpc: make of_device_ids const · ce6d73c9

Uwe Kleine-König authored Sep 10, 2014



of_device_ids (i.e. compatible strings and the respective data) are not
supposed to change at runtime. All functions working with of_device_ids
provided by <linux/of.h> work with const of_device_ids. This allows
all struct of_device_id instances to be marked const, too.

While touching these lines, also put the __init annotation at the right
position where necessary.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

ce6d73c9

powerpc/jump_label: use HAVE_JUMP_LABEL? · d4fe0965

Zhouyi Zhou authored Aug 21, 2014



CONFIG_JUMP_LABEL doesn't ensure HAVE_JUMP_LABEL. If that is not the
case, use the maintainers' own mutex to guard the modification of
global values.

Signed-off-by: Zhouyi Zhou <yizhouzhou@ict.ac.cn>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

d4fe0965

powerpc: Export dcr_ind_lock to fix build error · 22e55fcf

Pranith Kumar authored Aug 19, 2014



Fix build error caused by missing export:

ERROR: "dcr_ind_lock" [drivers/net/ethernet/ibm/emac/ibm_emac.ko] undefined!

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

22e55fcf

powerpc: Move htab_remove_mapping function prototype into header file · f6026df1

Anton Blanchard authored Aug 20, 2014



A recent patch added a function prototype for htab_remove_mapping in
C code. Move it into a header file.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

f6026df1

powerpc: Remove stale function prototypes · a38efcea

Anton Blanchard authored Aug 20, 2014



There were a number of prototypes for functions that no longer
exist. Remove them.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

a38efcea

powerpc: Ensure global functions include their prototype · 1217d34b

Anton Blanchard authored Aug 20, 2014



Fix a number of places where global functions were not including
their prototype. This ensures the prototype and the function match.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

1217d34b

powerpc: Make a bunch of things static · e51df2c1

Anton Blanchard authored Aug 20, 2014



Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

e51df2c1

powerpc: Separate ppc32 symbol exports into ppc_ksyms_32.c · 5144b6bf

Anton Blanchard authored Aug 20, 2014



Simplify things considerably by moving all the ppc32 specific
symbol exports into its own file.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

5144b6bf

powerpc: Move lib symbol exports into arch/powerpc/lib/ppc_ksyms.c · 7b20a955

Anton Blanchard authored Aug 20, 2014



Move the lib symbol exports closer to their function definitions

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

7b20a955

powerpc: Remove unused 32bit symbol exports · 5889bafa

Anton Blanchard authored Aug 20, 2014



Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

5889bafa

powerpc: Move more symbol exports next to function definitions · e1802b06

Anton Blanchard authored Aug 20, 2014

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

e1802b06

powerpc: Move via-cuda symbol exports next to function definitions · 4a1b08e8

Anton Blanchard authored Aug 20, 2014

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

4a1b08e8

powerpc: Move adb symbol exports next to function definitions · 370a3abd

Anton Blanchard authored Aug 20, 2014

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

370a3abd

powerpc/powernv: Check OPAL dump calls exist before using · 831cf65b

Michael Neuling authored Aug 19, 2014



Check that the OPAL_DUMP_READ token exists before initialising the dump
infrastructure.

This avoids littering the OPAL console with:
  "OPAL: Called with bad token 91"

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

831cf65b

powerpc/powernv: Check OPAL elog calls exist before using · 7dc992ec

Michael Neuling authored Aug 19, 2014



Check that the OPAL_ELOG_READ token exists before initialising the elog
infrastructure.

This avoids littering the OPAL console with:
  "OPAL: Called with bad token 74"

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

7dc992ec

powerpc/powernv: Check OPAL RTC calls exists before using · 035ed26f

Michael Neuling authored Aug 19, 2014



Check that the OPAL_RTC_READ token exists before we use the OPAL RTC.

Refactors the code a little to merge error paths.

This avoids littering the OPAL console with:
  "OPAL: Called with bad token 3".

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

035ed26f

powerpc/powernv: Add OPAL check token call · bffe6bda

Michael Neuling authored Aug 19, 2014



Currently there is no way to generically check if an OPAL call exists or not
from the host kernel.

This adds an OPAL call opal_check_token() which tells you if the given token is
present in OPAL or not.
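
The guard pattern this enables can be sketched as follows. Everything here is a userspace mock: fake_opal_check_token() and the firmware token table are hypothetical stand-ins for the real call, and the 1/0 return convention is an assumption. The token numbers are the ones that appear in the "bad token" messages quoted in the neighbouring commits.

```c
#include <stddef.h>
#include <stdint.h>

/* Token numbers as quoted in the "Called with bad token" messages. */
#define OPAL_RTC_READ	3
#define OPAL_ELOG_READ	74
#define OPAL_DUMP_READ	91

/* Hypothetical firmware that implements RTC and elog but not dump. */
static const uint64_t fake_fw_tokens[] = { OPAL_RTC_READ, OPAL_ELOG_READ };

/* Mock of the check call: 1 if the token is present, 0 if absent. */
static int fake_opal_check_token(uint64_t token)
{
	size_t i;

	for (i = 0; i < sizeof(fake_fw_tokens) / sizeof(fake_fw_tokens[0]); i++)
		if (fake_fw_tokens[i] == token)
			return 1;
	return 0;
}

/* Initialise a subsystem only if firmware provides the call it needs.
 * Returns 0 on success, -1 if the call is missing. */
static int init_if_supported(uint64_t token)
{
	if (!fake_opal_check_token(token))
		return -1;	/* skip quietly, never call into firmware */
	/* ... real initialisation would go here ... */
	return 0;
}
```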

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

bffe6bda

powerpc: Fix build error with CONFIG_PCI=n · 3484a31f

Pranith Kumar authored Aug 18, 2014

Fix ppc 32 build failure as reported here:

http://kisskb.ellerman.id.au/kisskb/buildresult/11663513/



The error is as follows:

arch/powerpc/include/asm/floppy.h:142:20: error: 'isa_bridge_pcidev' undeclared
(first use in this function)

This happens because floppy.o is enabled by BLK_DEV_FD, which depends on
ARCH_MAY_HAVE_PC_FDC, which is in turn enabled if PPC_PSERIES=n.

This commit changes the dependency so that ARCH_MAY_HAVE_PC_FDC depends
exclusively on PCI, since otherwise it will not compile.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
CC: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

3484a31f

powerpc/boot: Don't install zImage.* from make install · c913e5f9

Tony Breeds authored Aug 14, 2014



In commit 29f1aff2 (powerpc: Copy bootable images in the default
install script) we changed to copying all the built boot targets based
on the assumption that it's backwards compatible.  It turns out that
Debian-derived installkernel scripts will barf if not given exactly 4
args.

This change reverts make install to just install the vmlinux (we can
change the default in a separate patch) and introduces a new make
zInstall which works with a more flexible installkernel script.

Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

c913e5f9

powerpc/powernv: Improve error messages in dump code · cdd91b89

Vasant Hegde authored Aug 14, 2014



Presently we only support initiating a Service Processor dump from the
host. Hence update the sysfs message. Also update a couple of other
error/info messages.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cdd91b89

Sep 23, 2014

powerpc/pseries: Drop unnecessary continue · 2172d660

Himangi Saraogi authored Aug 13, 2014



Continue is not needed at the bottom of a loop.

The Coccinelle semantic patch implementing this change is:

@@
@@

for (...;...;...) {
  ...
  if (...) {
    ...
-   continue;
  }
}

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

2172d660

Sep 13, 2014

parisc: Implement new LWS CAS supporting 64 bit operations. · 89206491

Guy Martin authored Sep 12, 2014



The current LWS CAS only works correctly for 32-bit. The new LWS allows
for CAS operations of variable size.

Signed-off-by: Guy Martin <gmsoft@tuxicoman.be>
Cc: <stable@vger.kernel.org> # 3.13+
Signed-off-by: Helge Deller <deller@gmx.de>

89206491

Make ARCH_HAS_FAST_MULTIPLIER a real config variable · 72d93104

Linus Torvalds authored Sep 13, 2014



It used to be an ad-hoc hack defined by the x86 version of
<asm/bitops.h> that enabled a couple of library routines to know whether
an integer multiply is faster than repeated shifts and additions.

This just makes it use the real Kconfig system instead, and makes x86
(which was the only architecture that did this) select the option.

NOTE! Even for x86, this really is kind of wrong.  If we cared, we would
probably not enable this for builds optimized for netburst (P4), where
shifts-and-adds are generally faster than multiplies.  This patch does
*not* change that kind of logic, though, it is purely a syntactic change
with no code changes.

This was triggered by the fact that we have other places that really
want to know "do I want to expand multiplies by constants by hand or
not", particularly the hash generation code.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

72d93104

Sep 11, 2014

xen/arm: remove mach_to_phys rbtree · d50582e0

Stefano Stabellini authored Sep 10, 2014



Remove the rbtree used to keep track of machine to physical mappings:
the frontend can grant the same page multiple times, leading to errors
inserting or removing entries from the mach_to_phys tree.

Linux only needed to know the physical address corresponding to a given
machine address in swiotlb-xen. Now that swiotlb-xen can call the
xen_dma_* functions passing the machine address directly, we can remove
it.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Denis Schneider <v1ne2go@gmail.com>

d50582e0

xen/arm: reimplement xen_dma_unmap_page & friends · 340720be

Stefano Stabellini authored Sep 10, 2014

xen_dma_unmap_page, xen_dma_sync_single_for_cpu and
xen_dma_sync_single_for_device are currently implemented by calling into
the corresponding generic ARM implementation of these functions. In
order to do this, firstly the dma_addr_t handle, that on Xen is a
machine address, needs to be translated into a physical address. The
operation is expensive and inaccurate, given that a single machine
address can correspond to multiple physical addresses in one domain,
because the same page can be granted multiple times by the frontend.

To avoid this problem, we introduce a Xen specific implementation of
xen_dma_unmap_page, xen_dma_sync_single_for_cpu and
xen_dma_sync_single_for_device, that can operate on machine addresses
directly.

The new implementation relies on the fact that the hypervisor creates a
second p2m mapping of any grant pages at physical address == machine
address of the page for dom0. Therefore we can access memory at physical
address == dma_addr_t handle and perform the cache flushing there. Some
cache maintenance operations require a virtual address. Instead of using
ioremap_cache, that is not safe in interrupt context, we allocate a
per-cpu PAGE_KERNEL scratch page and we manually update the pte for it.

arm64 doesn't need cache maintenance operations on unmap for now.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Denis Schneider <v1ne2go@gmail.com>

340720be

xen/arm: introduce XENFEAT_grant_map_identity · 5ebc77de

Stefano Stabellini authored Sep 10, 2014



The flag tells us that the hypervisor maps a grant page to guest
physical address == machine address of the page in addition to the
normal grant mapping address. It is needed to properly issue cache
maintenance operation at the completion of a DMA operation involving a
foreign grant.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Denis Schneider <v1ne2go@gmail.com>

5ebc77de

arm64: flush TLS registers during exec · eb35bdd7

Will Deacon authored Sep 11, 2014



Nathan reports that we leak TLS information from the parent context
during an exec, as we don't clear the TLS registers when flushing the
thread state.

This patch updates the flushing code so that we:

  (1) Unconditionally zero the tpidr_el0 register (since this is fully
      context switched for native tasks and zeroed for compat tasks)

  (2) Zero the tp_value state in thread_info before clearing the
      tpidrro_el0 register for compat tasks (since this is only writable
      by the set_tls compat syscall and therefore not fully switched).

A missing compiler barrier is also added to the compat set_tls syscall.

Cc: <stable@vger.kernel.org>
Acked-by: Nathan Lynch <Nathan_Lynch@mentor.com>
Reported-by: Nathan Lynch <Nathan_Lynch@mentor.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

eb35bdd7

Sep 10, 2014

sh: get_user_pages_fast() must flush cache · caac7e6d

Stas Sergeev authored Sep 09, 2014

This patch avoids fuse hangs on sh4 by flushing the cache on
get_user_pages_fast().  This is not necessarily a good thing to do, but
get_user_pages() does this, so get_user_pages_fast() should too.

Please note the patch for mips arch that addresses the similar problem:
  https://kernel.googlesource.com/pub/scm/linux/kernel/git/ralf/linux/+/linux-3.4.50%5E!/#F0



They basically just disable get_user_pages_fast() altogether, falling
back to get_user_pages().  My fix is different: it adds explicit cache
flushes.

Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Kamal Dasu <kdasu.kdev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

caac7e6d

x86/xen: don't copy bogus duplicate entries into kernel page tables · 0b5a5063

Stefan Bader authored Sep 02, 2014



When RANDOMIZE_BASE (KASLR) is enabled, or the sum of all loaded
modules exceeds 512 MiB, loading modules fails with a warning
(and hence a vmalloc allocation failure) because the PTEs for the
newly-allocated vmalloc address space are not zero.

  WARNING: CPU: 0 PID: 494 at linux/mm/vmalloc.c:128
           vmap_page_range_noflush+0x2a1/0x360()

This is caused by xen_setup_kernel_pagetables() copying
level2_kernel_pgt into level2_fixmap_pgt, overwriting many non-present
entries.

Without KASLR, the normal kernel image size only covers the first half
of level2_kernel_pgt and module space starts after that.

L4[511]->level3_kernel_pgt[510]->level2_kernel_pgt[  0..255]->kernel
                                                  [256..511]->module
                          [511]->level2_fixmap_pgt[  0..505]->module

This allows 512 MiB of module vmalloc space to be used before
having to use the corrupted level2_fixmap_pgt entries.

With KASLR enabled, the kernel image uses the full PUD range of 1G and
module space starts in the level2_fixmap_pgt. So basically:

L4[511]->level3_kernel_pgt[510]->level2_kernel_pgt[0..511]->kernel
                          [511]->level2_fixmap_pgt[0..505]->module

And now no module vmalloc space can be used without using the corrupt
level2_fixmap_pgt entries.

Fix this by properly converting the level2_fixmap_pgt entries to MFNs,
and setting level1_fixmap_pgt as read-only.

A number of comments were also using the wrong L3 offset for
level2_kernel_pgt.  These have been corrected.
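
The table indices in the layouts above follow from splitting a virtual address into 9-bit fields. The sketch below assumes x86-64 4-level paging with 2 MiB L2 entries and a kernel mapping base of 0xffffffff80000000. The base address, the derived module start addresses and the helper names are assumptions made for illustration.

```c
#include <stdint.h>

/* Assumed kernel mapping base and module start addresses. */
#define KERNEL_MAP_BASE		0xffffffff80000000ULL
#define MODULES_START_NO_KASLR	(KERNEL_MAP_BASE + (512ULL << 20)) /* +512 MiB */
#define MODULES_START_KASLR	(KERNEL_MAP_BASE + (1ULL << 30))   /* +1 GiB */

/* Each paging level indexes a 512-entry table with 9 address bits. */
static unsigned int l4_index(uint64_t va)
{
	return (unsigned int)((va >> 39) & 0x1ff);
}

static unsigned int l3_index(uint64_t va)
{
	return (unsigned int)((va >> 30) & 0x1ff);
}

static unsigned int l2_index(uint64_t va)
{
	return (unsigned int)((va >> 21) & 0x1ff);
}
```

With these assumptions the kernel base maps to L4[511], L3[510], L2[0]. Module space starts at L3[510], L2[256] without KASLR, and at L3[511], L2[0] with KASLR, which is where level2_fixmap_pgt hangs.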

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: stable@vger.kernel.org

0b5a5063