- Jul 12, 2011
-
Paul Mackerras authored
This arranges for the top-level arch/powerpc/kvm/powerpc.c file to pass down some of the calls it gets to the lower-level subarchitecture specific code. The lower-level implementations (in booke.c and book3s.c) are no-ops. The coming book3s_hv.c will need this. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
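A minimal sketch of the pass-down pattern this describes (hook and parameter names here are assumptions for illustration, not taken from the patch):

    /* arch/powerpc/kvm/powerpc.c: the top level forwards to the subarch code */
    int kvm_arch_prepare_memory_region(struct kvm *kvm,
                                       struct kvm_userspace_memory_region *mem)
    {
            return kvmppc_core_prepare_memory_region(kvm, mem);
    }

    /* booke.c / book3s.c: a no-op for now; book3s_hv.c will do real work here */
    int kvmppc_core_prepare_memory_region(struct kvm *kvm,
                                          struct kvm_userspace_memory_region *mem)
    {
            return 0;
    }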
-
Paul Mackerras authored
Instead of branching out-of-line with the DO_KVM macro to check if we are in a KVM guest at the time of an interrupt, this moves the KVM check inline in the first-level interrupt handlers. This speeds up the non-KVM case and makes sure that none of the interrupt handlers are missing the check. Because the first-level interrupt handlers are now larger, some things had to be moved out of line in exceptions-64s.S. This all necessitated some minor changes to the interrupt entry code in KVM. This also streamlines the book3s_32 KVM test. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Paul Mackerras authored
In preparation for adding code to enable KVM to use hypervisor mode on 64-bit Book 3S processors, this splits book3s.c into two files, book3s.c and book3s_pr.c, where book3s_pr.c contains the code that is specific to running the guest in problem state (user mode) and book3s.c contains code which should apply to all Book 3S processors. In doing this, we abstract some details, namely the interrupt offset, updating the interrupt pending flag, and detecting if the guest is in a critical section. These are all things that will be different when we use hypervisor mode. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Paul Mackerras authored
This moves the slb field, which represents the state of the emulated SLB, from the kvmppc_vcpu_book3s struct to the kvm_vcpu_arch, and the hpte_hash_[v]pte[_long] fields from kvm_vcpu_arch to kvmppc_vcpu_book3s. This is in accord with the principle that the kvm_vcpu_arch struct represents the state of the emulated CPU, and the kvmppc_vcpu_book3s struct holds the auxiliary data structures used in the emulation. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
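Roughly, the layout after the move looks like this (abridged sketch; surrounding fields omitted and exact types assumed from the description):

    /* kvm_vcpu_arch: state of the emulated CPU */
    struct kvm_vcpu_arch {
            /* ... other emulated-CPU state ... */
            struct kvmppc_slb slb[64];      /* emulated SLB, moved in */
    };

    /* kvmppc_vcpu_book3s: auxiliary data used in the emulation */
    struct kvmppc_vcpu_book3s {
            /* ... other auxiliary structures ... */
            struct hlist_head hpte_hash_pte[HPTEG_HASH_NUM_PTE];    /* moved in */
            struct hlist_head hpte_hash_vpte[HPTEG_HASH_NUM_VPTE];
            struct hlist_head hpte_hash_vpte_long[HPTEG_HASH_NUM_VPTE_LONG];
    };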
-
Liu Yu authored
Dynamically assign host PIDs to guest PIDs, splitting each guest PID into multiple host (shadow) PIDs based on kernel/user and MSR[IS/DS]. Use both PID0 and PID1 so that the shadow PIDs for the right mode can be selected, corresponding both to guest TID = zero and guest TID = guest PID. This allows us to significantly reduce the frequency with which we need to invalidate the entire TLB: when the guest mode or PID changes, we just update the host PID0/PID1. And since the allocation of shadow PIDs is global, multiple guests can share the TLB without conflict. Note that KVM does not yet support the guest setting PID1 or PID2 to a value other than zero; this will need to be fixed for nested KVM to work. Until then, we enforce the requirement for guest PID1/PID2 to stay zero by failing the emulation if the guest tries to set them to something else. Signed-off-by:
Liu Yu <yu.liu@freescale.com> Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
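A hypothetical sketch of the shadow-PID selection described above (all names invented for illustration; the real patch keeps its own per-vcpu ID tables):

    /* Four shadow PIDs per guest PID: {kernel, user} x {AS0, AS1},
     * allocated from a global pool so guests can share the TLB. */
    static int get_shadow_pid(struct guest_pid_map *map, bool usermode, bool as1)
    {
            int idx = (usermode ? 1 : 0) | (as1 ? 2 : 0);

            if (map->shadow_pid[idx] < 0)
                    map->shadow_pid[idx] = alloc_shadow_pid_from_global_pool();
            return map->shadow_pid[idx];
    }

On a guest mode or PID change, only the host PID0/PID1 registers are rewritten with the new shadow PIDs; the TLB contents stay valid.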
-
Liu Yu authored
Instead of a fully separate set of TLB entries, keep just the pfn and dirty status. Signed-off-by:
Liu Yu <yu.liu@freescale.com> Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
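Per guest TLB entry, the host-side state shrinks to roughly this (a sketch; field names assumed):

    struct tlbe_ref {
            unsigned long pfn;      /* host page frame backing the mapping */
            unsigned int flags;     /* valid/dirty tracking bits */
    };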
-
Scott Wood authored
This is a shared page used for paravirtualization. It is always present in the guest kernel's effective address space at the address indicated by the hypercall that enables it. The physical address specified by the hypercall is not used, as e500 does not have real mode. Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Scott Wood authored
This is in line with what other architectures do, and will allow us to map things other than ordinary, unreserved kernel pages -- such as dedicated devices, or large contiguous reserved regions. Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Scott Wood authored
This is done lazily. The SPE save will be done only if the guest has used SPE since the last preemption or heavyweight exit. Restore will be done only on demand, when enabling MSR_SPE in the shadow MSR, in response to an SPE fault or mtmsr emulation. For SPEFSCR, Linux already switches it on context switch (non-lazily), so the only remaining bit is to save it between qemu and the guest. Signed-off-by:
Liu Yu <yu.liu@freescale.com> Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
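A hedged sketch of the on-demand restore path (helper names assumed from the description):

    /* Called from the SPE-unavailable fault or from mtmsr emulation;
     * guest SPE state is loaded only when the guest actually needs it. */
    static void kvmppc_vcpu_enable_spe(struct kvm_vcpu *vcpu)
    {
            preempt_disable();
            enable_kernel_spe();            /* grant this thread SPE access */
            kvmppc_load_guest_spe(vcpu);    /* restore evr[]/acc into registers */
            vcpu->arch.shadow_msr |= MSR_SPE;
            preempt_enable();
    }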
-
Scott Wood authored
Keep the guest MSR and the guest-mode true MSR separate, rather than modifying the guest MSR on each guest entry to produce a true MSR. Any bits which should be modified based on guest MSR must be explicitly propagated from vcpu->arch.shared->msr to vcpu->arch.shadow_msr in kvmppc_set_msr(). While we're modifying the guest entry code, reorder a few instructions to bury some load latencies. Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
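Taking MSR_SPE as the example bit, the propagation would look roughly like this (a sketch, not the verbatim patch):

    void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
    {
            vcpu->arch.shared->msr = new_msr;

            /* Explicitly propagate the guest-MSR-dependent bits into the
             * true MSR used while running in guest mode. */
            vcpu->arch.shadow_msr &= ~MSR_SPE;
            vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_SPE;
    }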
-
Scott Wood authored
Previously, these macros hardcoded THREAD_EVR0 as the base of the save area, relative to the base register passed. This base offset is now passed as a separate macro parameter, allowing reuse with other SPE save areas, such as used by KVM. Acked-by:
Kumar Gala <galak@kernel.crashing.org> Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
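The shape of the change, roughly (illustrative; the real macros cover all 32 EVRs and live in asm/ppc_asm.h):

    /* before: save-area offset hardcoded to the thread struct */
    #define SAVE_EVR(n, s, base)    evmergehi s,s,n; stw s,THREAD_EVR0+4*(n)(base)

    /* after: the caller supplies the base offset, so KVM can reuse it */
    #define SAVE_EVR(n, s, base, o) evmergehi s,s,n; stw s,o+4*(n)(base)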
-
Alexander Graf authored
Up until now, Book3S KVM had variables stored in the kernel that a kernel module or the kvm code in the kernel could read from to figure out where some real mode helper functions are located. This is all unnecessary. The high bits of the EA get ignored in real mode, so we can just use the pointer as is. Also, it's a lot easier on relocations when we use the normal way of resolving the address to a function, instead of jumping through hoops. This patch fixes compilation with CONFIG_RELOCATABLE=y. Signed-off-by:
Alexander Graf <agraf@suse.de>
-
- Jun 27, 2011
-
KAMEZAWA Hiroyuki authored
commit 21a3c964 uses node_start/end_pfn(nid) to detect the start and end of nodes, but these are not defined in linux/mmzone.h; they are defined in arch/???/include/mmzone.h, which is included only under CONFIG_NEED_MULTIPLE_NODES=y. Then we see:

mm/page_cgroup.c: In function 'page_cgroup_init':
mm/page_cgroup.c:308: error: implicit declaration of function 'node_start_pfn'
mm/page_cgroup.c:309: error: implicit declaration of function 'node_end_pfn'

Fixing page_cgroup.c is one idea, but node_start_pfn()/node_end_pfn() are very generic macros and should be implemented in the same manner for all archs (m32r has a different implementation...). This patch removes the definitions of node_start/end_pfn() in each arch and defines a unified one in linux/mmzone.h, no longer under CONFIG_NEED_MULTIPLE_NODES. The result of macro expansion (mm/page_cgroup.c) is:

for !NUMA:
    start_pfn = ((&contig_page_data)->node_start_pfn);
    end_pfn = ({ pg_data_t *__pgdat = (&contig_page_data); __pgdat->node_start_pfn + __pgdat->node_spanned_pages; });

for NUMA (x86-64):
    start_pfn = ((node_data[nid])->node_start_pfn);
    end_pfn = ({ pg_data_t *__pgdat = (node_data[nid]); __pgdat->node_start_pfn + __pgdat->node_spanned_pages; });

Changelog:
- fixed to avoid using "nid" twice in the node_end_pfn() macro.

Reported-and-acked-by:
Randy Dunlap <randy.dunlap@oracle.com> Reported-and-tested-by:
Ingo Molnar <mingo@elte.hu> Acked-by:
Mel Gorman <mgorman@suse.de> Signed-off-by:
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
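The unified definitions would look roughly like this (a sketch of the linux/mmzone.h change, matching the changelog note about evaluating "nid" only once):

    #define node_start_pfn(nid)     (NODE_DATA(nid)->node_start_pfn)

    #define node_end_pfn(nid) ({                                          \
            pg_data_t *__pgdat = NODE_DATA(nid);                          \
            __pgdat->node_start_pfn + __pgdat->node_spanned_pages;        \
    })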
-
- Jun 02, 2011
-
Kumar Gala authored
arch/powerpc/kernel/built-in.o: In function `machine_check_e500mc':
arch/powerpc/kernel/traps.c:429: undefined reference to `fsl_rio_mcheck_exception'
arch/powerpc/kernel/built-in.o: In function `machine_check_e500':
arch/powerpc/kernel/traps.c:519: undefined reference to `fsl_rio_mcheck_exception'
make: *** [.tmp_vmlinux1] Error 1
Reported-by:
Timur Tabi <timur@freescale.com> Signed-off-by:
Kumar Gala <galak@kernel.crashing.org>
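One common shape for this kind of fix, sketched here as an assumption (the actual patch may instead guard the call sites in traps.c):

    /* in the header that declares fsl_rio_mcheck_exception() */
    #ifdef CONFIG_FSL_RIO
    int fsl_rio_mcheck_exception(struct pt_regs *regs);
    #else
    static inline int fsl_rio_mcheck_exception(struct pt_regs *regs)
    {
            return 0;       /* not handled; fall back to default handling */
    }
    #endif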
-
- May 28, 2011
-
Eric W. Biederman authored
32bit and 64bit on x86 are tested and working. The rest I have looked at closely and I can't find any problems. setns is an easy system call to wire up: it just takes two ints, so I don't expect any weird architecture porting problems.

While doing this I have noticed that we have some architectures that are very slow to get new system calls: cris seems to be the slowest, where the last system calls wired up were preadv and pwritev; avr32 is weird in that recvmmsg was wired up but never declared in unistd.h; frv is behind, with perf_event_open being the last syscall wired up; on h8300 the last system call wired up was epoll_wait; on m32r it was fallocate; mn10300 has recvmmsg as the last system call wired up. The rest seem to at least have syncfs wired up, which was new in 2.6.39.

v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com>
v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com>
v4: Moved wiring up of the system call to another patch
v5: ported to v2.6.39-rc6
v6: rebased onto parisc-next and net-next to avoid syscall conflicts.
v7: ported to Linus's latest post 2.6.39 tree.

> arch/blackfin/include/asm/unistd.h | 3 ++-
> arch/blackfin/mach-common/entry.S | 1 +

Acked-by:
Mike Frysinger <vapier@gentoo.org> Oh - ia64 wiring looks good. Acked-by:
Tony Luck <tony.luck@intel.com> Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
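For reference, the userspace-visible shape of the call (two ints, as noted above):

    /* Attach the calling thread to the namespace referred to by fd;
     * nstype restricts which namespace type is accepted (0 = any). */
    int setns(int fd, int nstype);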
-
- May 26, 2011
-
Milton Miller authored
The cell iic interrupt controller has enough software-caused interrupts to use a unique interrupt for each of the 4 messages powerpc uses. This means each interrupt gets its own irq action/data combination. Use the separate, optimized, arch-common ipi action functions registered via the helper smp_request_message_ipi, instead of passing the message as action data to a single action that then demultiplexes to the required action via a switch statement. smp_request_message_ipi will register the action as IRQF_PER_CPU and IRQF_DISABLED, and WARN if the allocation fails for some reason, so there is no need to print on that failure. It will return positive if the message will not be used by the kernel, in which case we can free the virq. In addition to eliminating inefficient code, this also corrects the error that a kernel built with kexec but without a debugger would not register the ipi for kdump to notify the other cpus of a crash. This also restores the debugger action to be static to kernel/smp.c. Signed-off-by:
Milton Miller <miltonm@bga.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Brian King authored
When page coalescing support was added recently, the MAX_HCALL_OPCODE define was not updated for the newly added H_GET_MPP_X hcall. Signed-off-by:
Brian King <brking@linux.vnet.ibm.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
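The fix is most likely a one-liner of this shape (hedged; the exact context is not shown in the message):

    /* H_GET_MPP_X is now the highest-numbered hcall we recognize */
    #define MAX_HCALL_OPCODE        H_GET_MPP_X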
-
Ian Munsie authored
This patch implements the raw syscall tracepoints on PowerPC and exports them for ftrace syscalls to use. To minimise reworking existing code, I slightly re-ordered the thread info flags such that the new TIF_SYSCALL_TRACEPOINT bit would still fit within the 16 bits of the andi. instruction's UI field. The instructions in question, in arch/powerpc/kernel/entry_{32,64}.S, AND _TIF_SYSCALL_T_OR_A with the thread flags to see if system call tracing is enabled. In the case of 64bit PowerPC, arch_syscall_addr and arch_syscall_match_sym_name are overridden to allow ftrace syscalls to work given the unusual system call table structure and symbol names that start with a period. Signed-off-by:
Ian Munsie <imunsie@au1.ibm.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- May 25, 2011
-
Peter Zijlstra authored
In case other architectures require RCU freed page-tables to implement gup_fast() and software filled hashes and similar things, provide the means to do so by moving the logic into generic code. Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Requested-by:
David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Tony Luck <tony.luck@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Namhyung Kim <namhyung@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Peter Zijlstra authored
Fix up powerpc to the new mmu_gather stuff. PPC has an extra batching queue to RCU free the actual pagetable allocations; use the ARCH extensions for that for now. For the ppc64_tlb_batch, which tracks the vaddrs to unhash from the hardware hash-table, keep using per-cpu arrays but flush on context switch and use a TLF bit to track the lazy_mmu state. Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Miller <davem@davemloft.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Tony Luck <tony.luck@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Namhyung Kim <namhyung@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- May 24, 2011
-
Rafael J. Wysocki authored
All architectures supporting hibernation define arch_prepare_suspend() as an empty function, so remove it. Signed-off-by:
Rafael J. Wysocki <rjw@sisk.pl>
-
- May 22, 2011
-
Scott Wood authored
Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Scott Wood authored
Linux doesn't use USPRG0 (now renamed VRSAVE in the architecture, even when Altivec isn't involved), but a guest might. Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Scott Wood authored
Return the actual host SVR for now, as we already do for PVR. Eventually we may support Qemu overriding PVR/SVR if the situation is appropriate, once we implement KVM_SET_SREGS on e500. Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
- May 20, 2011
-
Shaohui Xie authored
Add machine check support to machine_check_e500 and machine_check_e500mc. Signed-off-by:
Shaohui Xie <b21989@freescale.com> Cc: Li Yang <leoli@freescale.com> Cc: Roy Zang <tie-fei.zang@freescale.com> Cc: Alexandre Bounine <alexandre.bounine@idt.com> Signed-off-by:
Kumar Gala <galak@kernel.crashing.org>
-
Shengzhou Liu authored
Simultaneous FCM and GPCM or UPM operation may erroneously trigger bus monitor timeout. Set the local bus monitor timeout value to the maximum by setting LBCR[BMT] = 0 and LBCR[BMTPS] = 0xF. Signed-off-by:
Shengzhou Liu <Shengzhou.Liu@freescale.com> Signed-off-by:
Kumar Gala <galak@kernel.crashing.org>
-
Paul Mackerras authored
Commits a5d4f3ad ("powerpc: Base support for exceptions using HSRR0/1") and 673b189a ("powerpc: Always use SPRN_SPRG_HSCRATCH0 when running in HV mode") cause compile and link errors for 32-bit classic Book 3S processors when KVM is enabled. This fixes these errors. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- May 19, 2011
-
Scott Wood authored
Add support for MPIC timers as requestable interrupt sources. Based on http://patchwork.ozlabs.org/patch/20941/ by Dave Liu. Signed-off-by:
Dave Liu <daveliu@freescale.com> Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Kumar Gala <galak@kernel.crashing.org>
-
Scott Wood authored
Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Kumar Gala <galak@kernel.crashing.org>
-
Scott Wood authored
Without this, we attempt to use doorbells for IPIs, and end up branching to some bad address. Plus, even for the exceptions we don't implement, it's good to handle it and get a message out. Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Kumar Gala <galak@kernel.crashing.org>
-
Kumar Gala authored
Signed-off-by:
Kumar Gala <galak@kernel.crashing.org>
-
Milton Miller authored
The only references to the irq_map[].host field are internal to arch/powerpc/kernel/irq.c Signed-off-by:
Milton Miller <miltonm@bga.com> Acked-by:
Grant Likely <grant.likely@secretlab.ca> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Milton Miller authored
Some irq_host implementations are using virq_to_host to check if they are the irq_host for a virtual irq. To allow us to make space-versus-time tradeoffs, replace this usage with an assertive virq_is_host that confirms or denies that the irq is associated with the given irq_host. Signed-off-by:
Milton Miller <miltonm@bga.com> Acked-by:
Grant Likely <grant.likely@secretlab.ca> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
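A sketch of the new predicate (return type and body assumed from the description):

    /* confirms or denies that virq belongs to the given host, without
     * handing out the irq_map[].host pointer itself */
    bool virq_is_host(unsigned int virq, struct irq_host *host)
    {
            return irq_map[virq].host == host;
    }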
-
Milton Miller authored
It was called from irq_create_mapping if that was called for a host and hwirq that was previously mapped, "to update the flags". But the only implementation was in beat_interrupt, and all it did was repeat a hypervisor call without error checking that was performed with error checking at the beginning of the map hook. In addition, the comment on the beat remap hook says it will only be called once for a given mapping, which would apply to map, not remap. All flags should be known by the time the match hook is called, before we call the map hook. Removing this mostly unused hook will simplify the requirements of the irq_domain concept. Signed-off-by:
Milton Miller <miltonm@bga.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Milton Miller authored
Compile the new smp ipi mux and demux code only if a platform will make use of it. The new config is selected as required. The new cause_ipi smp op is only available conditionally, to point out configs where the select is required; this makes setting the op an immediate build failure instead of a deferred unresolved symbol at link time. This also creates a new config for power surge powermac upgrade support that can be disabled in expert mode but defaults to on. I also removed the depends / default y on CONFIG_XICS since it is selected by PSERIES. Signed-off-by:
Milton Miller <miltonm@bga.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Milton Miller authored
Consolidate the mux and demux of ipi messages into smp.c and call a new smp_ops callback to actually trigger the ipi. The powerpc architecture code is optimised for having 4 distinct ipi triggers, which are mapped to 4 distinct messages (ipi many, ipi single, scheduler ipi, and enter debugger). However, several interrupt controllers only provide a single software triggered interrupt that can be delivered to each cpu. To resolve this limitation, each smp_ops implementation created a per-cpu variable that is manipulated with atomic bitops. Since these lines will be contended, they are optimally marked as shared_aligned and take a full cache line for each cpu. Distro kernels may have 2 or 3 of these in their config, each taking per-cpu space even though at most one will be in use.

This consolidation removes smp_message_recv and replaces the single call actions cases with direct calls from the common message recognition loop. The complicated debugger ipi case with its muxed crash handling code is moved to debug_ipi_action, which is now called from the demux code (instead of the multi-message action calling smp_message_recv). I put a call to reschedule_action to increase the likelihood of correctly merging the anticipated scheduler_ipi() hook coming from the scheduler tree; that single required call can be inlined later.

The actual message decode is a copy of the old pseries xics code with its memory barriers and cache line spacing, augmented with a per-cpu unsigned long based on the book-e doorbell code. The optional data is set via a callback from the implementation and is passed to the new cause-ipi hook along with the logical cpu number. While currently only the doorbell implementation uses this data, it should be almost zero cost to retrieve and pass it -- it adds a single register load for the argument from the same cache line to which we just completed a store, and the register is dead on return from the call. I extended the data element from unsigned int to unsigned long in case some other code wanted to associate a pointer.

The doorbell check_self is replaced by a call to smp_muxed_ipi_resend, conditioned on the CPU_DBELL feature. The ifdef guard could be relaxed to CONFIG_SMP but I left it with BOOKE for now. Also, the doorbell interrupt vector for book-e was not calling irq_enter and irq_exit, which throws off cpu accounting and causes code to not realize it is running in interrupt context. Add the missing calls. Signed-off-by:
Milton Miller <miltonm@bga.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
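A much-simplified sketch of the mux/demux described above (the real code keeps cache-line-aligned per-cpu state and the xics-derived barriers; the per-cpu variable names here are assumptions):

    static DEFINE_PER_CPU_SHARED_ALIGNED(unsigned long, ipi_message);
    static DEFINE_PER_CPU(unsigned long, ipi_data);  /* set by the implementation */

    void smp_muxed_ipi_message_pass(int cpu, int msg)
    {
            set_bit(msg, &per_cpu(ipi_message, cpu));
            smp_mb();       /* publish the message before triggering the irq */
            smp_ops->cause_ipi(cpu, per_cpu(ipi_data, cpu));
    }

    irqreturn_t smp_ipi_demux(void)
    {
            unsigned long all;

            mb();           /* order the irq ack before reading messages */
            while ((all = xchg(&__get_cpu_var(ipi_message), 0)) != 0) {
                    if (all & (1UL << PPC_MSG_CALL_FUNCTION))
                            generic_smp_call_function_interrupt();
                    if (all & (1UL << PPC_MSG_RESCHEDULE))
                            reschedule_action(0, NULL);
                    if (all & (1UL << PPC_MSG_DEBUGGER_BREAK))
                            debug_ipi_action(0, NULL);
            }
            return IRQ_HANDLED;
    }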
-
Milton Miller authored
I can't see any reason these functions are needed by machdep.h and they are all hidden by CONFIG_SMP with no UP alternative. Also move the declarations for the fallback timebase ops, which are used to fill in the smp ops. Signed-off-by:
Milton Miller <miltonm@bga.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Milton Miller authored
Replace all remaining callers of alloc_maybe_bootmem with zalloc_maybe_bootmem. The callsite in pci_dn is followed with a memset to clear the memory, and not zeroing at the other callsites in the celleb fake pci code could lead to following uninitialized memory as pointers or even freeing said pointers on error paths. Signed-off-by:
Milton Miller <miltonm@bga.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Milton Miller authored
Now that smp_ops->smp_message_pass is always called with an (online) cpu number for the target remove the checks for MSG_ALL and MSG_ALL_BUT_SELF. Signed-off-by:
Milton Miller <miltonm@bga.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Milton Miller authored
The only user of MSG_ALL_BUT_SELF in the whole kernel tree is powerpc, and it only uses it to start the debugger. Both debuggers always call smp_send_debugger_break with MSG_ALL_BUT_SELF, and only mpic can do anything more optimal than a loop over all online cpus, but all message passing implementations have to code for this special delivery target. Convert smp_send_debugger_break to take void and loop calling the smp_ops message_pass function for each of the other cpus in the online cpumask. Use raw_smp_processor_id() because we are either entering the debugger or trying to start kdump, and the additional warning is not useful were it to trigger. Signed-off-by:
Milton Miller <miltonm@bga.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
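The converted function ends up close to this sketch:

    void smp_send_debugger_break(void)
    {
            int me = raw_smp_processor_id();
            int cpu;

            if (unlikely(!smp_ops))
                    return;

            for_each_online_cpu(cpu)
                    if (cpu != me)
                            smp_ops->message_pass(cpu, PPC_MSG_DEBUGGER_BREAK);
    }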
-