- Dec 09, 2010
-
-
Sebastian Siewior authored
This patch changes u32 to __be32 for all "ranges", "prop" and "addr" and such. Those variables are pointing to the device tree which containts intergers in big endian format. Most functions are doing it right because of_read_number() is doing the right thing for them. of_bus_isa_get_flags(), of_bus_pci_get_flags() and of_bus_isa_map() were accessing the data directly and were doing it wrong. Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Jesse Larrew authored
Tie the polling mechanism into the ibm,suspend-me rtas call to stop/restart polling before/after a suspend, hibernate, migrate, or checkpoint restart operation. This ensures that the system has a chance to disable the polling if the partition is migrated to a system that does not support VPHN (and vice versa). Signed-off-by:
Jesse Larrew <jlarrew@linux.vnet.ibm.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Jesse Larrew authored
This patch sets a timer during boot that will periodically poll the associativity change counters in the VPA. When a change in associativity is detected, it retrieves the new associativity domain information via the H_HOME_NODE_ASSOCIATIVITY hcall and updates the NUMA node maps and sysfs entries accordingly. Note that since the ibm,associativity device tree property does not exist on configurations with both NUMA and SPLPAR enabled, no device tree updates are necessary. Signed-off-by:
Jesse Larrew <jlarrew@linux.vnet.ibm.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Sonny Rao authored
Remove stale declaration of setup_pci_ptrs, aparently from ppc before 2.4.0 Remove #ifdef around struct existance delcaration Fix spelling of "linear" Signed-off-by:
Milton Miller <miltonm@bga.com> Signed-off-by:
Sonny Rao <sonnyrao@linux.vnet.ibm.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
The popcnt instructions went into binutils relatively recently. As with a number of other instructions, create macros and hardcode them. Signed-off-by:
Anton Blanchard <anton@samba.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- Nov 30, 2010
-
-
Benjamin Herrenschmidt authored
The nvram log partition stuff currently in nvram_64.c is really pseries specific. It isn't actually used on anything else (despite the fact that we ran the code to setup the partition on anything except powermac) and the log format is specific to pseries RTAS implementation. So move it where it belongs Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
This moves a bunch of definitions out of asm/nvram.h to the files that use them or just outright remove completely unused stuff. We leave the partition signatures definitions, they will be useful Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- Nov 29, 2010
-
-
Jesse Larrew authored
This simple patch adds the firmware feature for VPHN to the firmware features bitmask. Signed-off-by:
Jesse Larrew <jlarrew@linux.vnet.ibm.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Nishanth Aravamudan authored
Add a function to get the maximum address that can be hotplug added. This is needed to calculate the size of the tce table needed to cover all memory in 1:1 mode. Signed-off-by:
Milton Miller <miltonm@bga.com> Signed-off-by:
Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Nishanth Aravamudan authored
Also add a comment to dev_archdata, indicating that changes there need to be verified against the driver code. Signed-off-by:
Milton Miller <miltonm@bga.com> Signed-off-by:
Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Vaidyanathan Srinivasan authored
Create sysfs interface to export data from H_BEST_ENERGY hcall that can be used by administrative tools on supported pseries platforms for energy management optimizations. sys/device/system/cpu/pseries_(de)activate_hint_list and sys/device/system/cpu/cpuN/pseries_(de)activate_hint will provide hints for activation and deactivation of cpus respectively. These hints are abstract number given by the hypervisor based on the extended knowledge the hypervisor has regarding the system topology and resource mappings. The activate and the deactivate sysfs entry is for the two distinct operations that we could do for energy savings. When we have more capacity than required, we could deactivate few core to save energy. The choice of the core to deactivate will be based on /sys/devices/system/cpu/deactivate_hint_list. The comma separated list of cpus (cores) will be the preferred choice. If we have to activate some of the deactivated cores, then /sys/devices/system/cpu/activate_hint_list will be used. The per-cpu file /sys/device/system/cpu/cpuN/pseries_(de)activate_hint further provide more fine grain information by exporting the value of the hint itself. Added new driver module arch/powerpc/platforms/pseries/pseries_energy.c under new config option CONFIG_PSERIES_ENERGY Signed-off-by:
Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Vaidyanathan Srinivasan authored
These APIs take logical cpu number as input Change cpu_first_thread_in_core() to cpu_first_thread_sibling() Change cpu_last_thread_in_core() to cpu_last_thread_sibling() These APIs convert core number (index) to logical cpu/thread numbers Add cpu_first_thread_of_core(int core) Changed cpu_thread_to_core() to cpu_core_index_of_thread(int cpu) The goal is to make 'threads_per_core' accessible to the pseries_energy module. Instead of making an API to read threads_per_core, this is a higher level wrapper function to convert from logical cpu number to core number. The current APIs cpu_first_thread_in_core() and cpu_last_thread_in_core() returns logical CPU number while cpu_thread_to_core() returns core number or index which is not a logical CPU number. The new APIs are now clearly named to distinguish 'core number' versus first and last 'logical cpu number' in that core. The new APIs cpu_{first,last}_thread_sibling() work on logical cpu numbers. While cpu_first_thread_of_core() and cpu_core_index_of_thread() work on core index. Example usage: (4 threads per core system) cpu_first_thread_sibling(5) = 4 cpu_last_thread_sibling(5) = 7 cpu_core_index_of_thread(5) = 1 cpu_first_thread_of_core(1) = 4 cpu_core_index_of_thread() is used in cpu_to_drc_index() in the module and cpu_first_thread_of_core() is used in drc_index_to_cpu() in the module. Make API changes to few callers. Export symbols for use in modules. Signed-off-by:
Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Christian Dietrich authored
The __KERNEL__ ifdef isn't necessary at this point, because it is checked in an outer ifdef level already and has no effect here. Signed-off-by:
Christian Dietrich <qy03fugy@stud.informatik.uni-erlangen.de> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step this patch does all the work out of line, but it would be nice to implement them as inlines with an out of line fallback. The performance issue with hweight was noticed when disabling SMT on a large (192 thread) POWER7 box. The patch improves that testcase by about 8%. Signed-off-by:
Anton Blanchard <anton@samba.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- Oct 29, 2010
-
-
Dongdong Deng authored
commit 534af108(kgdb,kdb: individual register set and and get API) introduce dbg_get_reg/dbg_set_reg API for individual register get and set. This patch implement those APIs for ppc. Signed-off-by:
Dongdong Deng <dongdong.deng@windriver.com> Signed-off-by:
Jason Wessel <jason.wessel@windriver.com>
-
- Oct 28, 2010
-
-
Michael Holzheu authored
The taskstats interface uses microsecond granularity for the user and system time values. The conversion from cputime to the taskstats values uses the cputime_to_msecs primitive which effectively limits the granularity to milliseconds. Add the cputime_to_usecs primitive for architectures that have better, more precise CPU time values. Remove cputime_to_msecs primitive because there are no more users left. Signed-off-by:
Michael Holzheu <holzheu@linux.vnet.ibm.com> Acked-by:
Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Luck Tony <tony.luck@intel.com> Cc: Shailabh Nagar <nagar1234@in.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Shailabh Nagar <nagar@us.ibm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Oct 26, 2010
-
-
Peter Zijlstra authored
Since we no longer need to provide KM_type, the whole pte_*map_nested() API is now redundant, remove it. Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by:
Chris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Peter Zijlstra authored
Keep the current interface but ignore the KM_type and use a stack based approach. The advantage is that we get rid of crappy code like: #define __KM_PTE \ (in_nmi() ? KM_NMI_PTE : \ in_irq() ? KM_IRQ_PTE : \ KM_PTE0) and in general can stop worrying about what context we're in and what kmap slots might be appropriate for that. The downside is that FRV kmap_atomic() gets more expensive. For now we use a CPP trick suggested by Andrew: #define kmap_atomic(page, args...) __kmap_atomic(page) to avoid having to touch all kmap_atomic() users in a single patch. [ not compiled on: - mn10300: the arch doesn't actually build with highmem to begin with ] [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix up drivers/gpu/drm/i915/intel_overlay.c] Acked-by:
Rik van Riel <riel@redhat.com> Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by:
Chris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Airlie <airlied@linux.ie> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Oct 25, 2010
-
-
Lan Chunhe-B25806 authored
When system uses 36bit physical address, res.start is 36bit physical address. But the function of in_be32 returns 32bit physical address. Then both of them compared each other is wrong. So by converting the address of res.start into the right format fixes this issue. Signed-off-by:
Lan Chunhe-B25806 <b25806@freescale.com> Signed-off-by:
Roy Zang <tie-fei.zang@freescale.com> Reviewed-by:
Anton Vorontsov <cbouatmailru@gmail.com> Signed-off-by:
David Woodhouse <David.Woodhouse@intel.com>
-
Roy Zang authored
Move Freescale elbc interrupt from nand driver to elbc driver. Then all elbc devices can use the interrupt instead of ONLY nand. For former nand driver, it had the two functions: 1. detecting nand flash partitions; 2. registering elbc interrupt. Now, second function is removed to fsl_lbc.c. Signed-off-by:
Lan Chunhe-B25806 <b25806@freescale.com> Signed-off-by:
Roy Zang <tie-fei.zang@freescale.com> Reviewed-by:
Anton Vorontsov <cbouatmailru@gmail.com> Cc: Wood Scott-B07421 <B07421@freescale.com> Signed-off-by:
David Woodhouse <David.Woodhouse@intel.com>
-
- Oct 24, 2010
-
-
Alexander Graf authored
We have to protect the include for linux/of.h by __KERNEL__ so it doesn't accidently get referenced outside. This patch fixes this and makes the tree compile again. Reported-by:
Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Alexander Graf authored
The current interrupt logic is just completely broken. We get a notification from user space, telling us that an interrupt is there. But then user space expects us that we just acknowledge an interrupt once we deliver it to the guest. This is not how real hardware works though. On real hardware, the interrupt controller pulls the external interrupt line until it gets notified that the interrupt was received. So in reality we have two events: pulling and letting go of the interrupt line. To maintain backwards compatibility, I added a new request for the pulling part. The letting go part was implemented earlier already. With this in place, we can now finally start guests that do not randomly stall and stop to work at random times. This patch implements above logic for Book3S. Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Alexander Graf authored
Up until now we were doing segment mappings wrong on Book3s_32. For Book3s_64 we were using a trick where we know that a single mmu_context gives us 16 bits of context ids. The mm system on Book3s_32 instead uses a clever algorithm to distribute VSIDs across the available range, so a context id really only gives us 16 available VSIDs. To keep at least a few guest processes in the SID shadow, let's map a number of contexts that we can use as VSID pool. This makes the code be actually correct and shouldn't hurt performance too much. Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Alexander Graf authored
Now that the actual mtsr doesn't do anything anymore, we can move the sr contents over to the shared page, so a guest can directly read and write its sr contents from guest context. Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Alexander Graf authored
Right now we're examining the contents of Book3s_32's segment registers when the register is written and put the interpreted contents into a struct. There are two reasons this is bad. For starters, the struct has worse real-time performance, as it occupies more ram. But the more important part is that with segment registers being interpreted from their raw values, we can put them in the shared page, allowing guests to mess with them directly. This patch makes the internal representation of SRs be u32s. Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Alexander Graf authored
We will soon add SR PV support to the shared page, so we need some infrastructure that allows the guest to query for features KVM exports. This patch adds a second return value to the magic mapping that indicated to the guest which features are available. Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Alexander Graf authored
On Book3S KVM we directly expose some asm pointers to C code as variables. These need to be relocated and thus break on relocatable kernels. To make sure we can at least build, let's mark them as long instead of u32 where 64bit relocations don't work. This fixes the following build error: WARNING: 2 bad relocations^M > c000000000008590 R_PPC64_ADDR32 .text+0x4000000000008460^M > c000000000008594 R_PPC64_ADDR32 .text+0x4000000000008598^M Please keep in mind that actually using KVM on a relocated kernel might still break. This only fixes the compile problem. Reported-by:
Subrata Modak <subrata@linux.vnet.ibm.com> Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
On Book3s_32 the tlbie instruction flushed effective addresses by the mask 0x0ffff000. This is pretty hard to reflect with a hash that hashes ~0xfff, so to speed up that target we should also keep a special hash around for it. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
So far we've been running all code without locking of any sort. This wasn't really an issue because I didn't see any parallel access to the shadow MMU code coming. But then I started to implement dirty bitmapping to MOL which has the video code in its own thread, so suddenly we had the dirty bitmap code run in parallel to the shadow mmu code. And with that came trouble. So I went ahead and made the MMU modifying functions as parallelizable as I could think of. I hope I didn't screw up too much RCU logic :-). If you know your way around RCU and locking and what needs to be done when, please take a look at this patch. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
Now that we have the shared page in place and the MMU code knows about the magic page, we can expose that capability to the guest! Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
We need to override EA as well as PA lookups for the magic page. When the guest tells us to project it, the magic page overrides any guest mappings. In order to reflect that, we need to hook into all the MMU layers of KVM to force map the magic page if necessary. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
We will be introducing a method to project the shared page in guest context. As soon as we're talking about this coupling, the shared page is colled magic page. This patch introduces simple defines, so the follow-up patches are easier to read. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
On PowerPC it's very normal to not support all of the physical RAM in real mode. To check if we're matching on the shared page or not, we need to know the limits so we can restrain ourselves to that range. So let's make it a define instead of open-coding it. And while at it, let's also increase it. Signed-off-by:
Alexander Graf <agraf@suse.de> v2 -> v3: - RMO -> PAM (non-magic page) Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
When the guest turns on interrupts again, it needs to know if we have an interrupt pending for it. Because if so, it should rather get out of guest context and get the interrupt. So we introduce a new field in the shared page that we use to tell the guest that there's a pending interrupt lying around. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
While running in hooked code we need to store register contents out because we must not clobber any registers. So let's add some fields to the shared page we can just happily write to. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
When running in hooked code we need a way to disable interrupts without clobbering any interrupts or exiting out to the hypervisor. To achieve this, we have an additional critical field in the shared page. If that field is equal to the r1 register of the guest, it tells the hypervisor that we're in such a critical section and thus may not receive any interrupts. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
To communicate with KVM directly we need to plumb some sort of interface between the guest and KVM. Usually those interfaces use hypercalls. This hypercall implementation is described in the last patch of the series in a special documentation file. Please read that for further information. This patch implements stubs to handle KVM PPC hypercalls on the host and guest side alike. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
When in kernel mode there are 4 additional registers available that are simple data storage. Instead of exiting to the hypervisor to read and write those, we can just share them with the guest using the page. This patch converts all users of the current field to the shared page. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
The SRR0 and SRR1 registers contain cached values of the PC and MSR respectively. They get written to by the hypervisor when an interrupt occurs or directly by the kernel. They are also used to tell the rfi(d) instruction where to jump to. Because it only gets touched on defined events that, it's very simple to share with the guest. Hypervisor and guest both have full r/w access. This patch converts all users of the current field to the shared page. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
The DAR register contains the address a data page fault occured at. This register behaves pretty much like a simple data storage register that gets written to on data faults. There is no hypervisor interaction required on read or write. This patch converts all users of the current field to the shared page. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-