- Mar 12, 2010
-
-
Christoph Hellwig authored
On an architecture that supports 32-bit compat we need to override the reported machine in uname with the 32-bit value. Instead of doing this separately in every architecture introduce a COMPAT_UTS_MACHINE define in <asm/compat.h> and apply it directly in sys_newuname(). Signed-off-by:
Christoph Hellwig <hch@lst.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
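A minimal userspace sketch of the override described above, assuming a per-architecture string constant and a flag saying whether the caller is a 32-bit compat task. The names `uts_sketch`, `report_machine` and `SKETCH_COMPAT_UTS_MACHINE` are illustrative and are not the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MACHINE_LEN 65
/* Assumed per-architecture value; a real arch would define its own string. */
#define SKETCH_COMPAT_UTS_MACHINE "i686"

struct uts_sketch {
    char machine[SKETCH_MACHINE_LEN];
};

/* Report the native machine name unless the caller is a compat task. */
static void report_machine(struct uts_sketch *out, const char *native,
                           bool compat_task)
{
    assert(out != NULL && native != NULL);
    const char *name = compat_task ? SKETCH_COMPAT_UTS_MACHINE : native;
    snprintf(out->machine, sizeof(out->machine), "%s", name);
}
```

Keeping the override in one generic place means an architecture only has to supply the string, not its own uname wrapper.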
-
Christoph Hellwig authored
Add a generic implementation of the ipc demultiplexer syscall. Except for s390 and sparc64 all implementations of sys_ipc are nearly identical. There are slight differences in the types of the parameters, where mips and powerpc as the only 64-bit architectures with sys_ipc use unsigned long for the "third" argument as it gets cast to a pointer later, while it traditionally is an "int" like most other parameters. frv goes even further and uses unsigned long for all parameters except for "ptr" which is a pointer type everywhere. The change from int to unsigned long for "third" and back to "int" for the others on frv should be fine due to the in-register calling conventions for syscalls (we already had a similar issue with the generic sys_ptrace), but I'd prefer to have the arch maintainers look over this in detail. Except for that h8300, m68k and m68knommu lack an implementation of the semtimedop sub call which this patch adds, and various architectures have gets used - at least on i386 it seems superfluous as the compat code on x86-64 and ia64 doesn't even bother to implement it. [akpm@linux-foundation.org: add sys_ipc to sys_ni.c] Signed-off-by:
Christoph Hellwig <hch@lst.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Reviewed-by:
H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Acked-by:
Jesper Nilsson <jesper.nilsson@axis.com> Acked-by:
Russell King <rmk+kernel@arm.linux.org.uk> Acked-by:
David Howells <dhowells@redhat.com> Acked-by:
Kyle McMartin <kyle@mcmartin.ca> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
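As a rough illustration of what a demultiplexer syscall does, the sketch below maps a single `call` number onto a sub-operation name. The `SK_*` constants and `ipc_demux` are hypothetical names for this example; the split of `call` into a low 16-bit operation and a high 16-bit version field is an assumption about the traditional layout, not something stated in the message above.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative sub-call numbers (assumed values for this sketch). */
enum {
    SK_SEMOP = 1, SK_SEMGET = 2, SK_SEMCTL = 3, SK_SEMTIMEDOP = 4,
    SK_MSGSND = 11, SK_MSGRCV = 12, SK_MSGGET = 13, SK_MSGCTL = 14,
    SK_SHMAT = 21, SK_SHMDT = 22, SK_SHMGET = 23, SK_SHMCTL = 24
};

/* Return the name of the handler a call would be routed to, or NULL. */
static const char *ipc_demux(unsigned int call)
{
    unsigned int version = call >> 16; /* assumed: version in the high bits */
    (void)version;                     /* unused in this sketch */

    switch (call & 0xffff) {
    case SK_SEMOP:      return "semop";
    case SK_SEMGET:     return "semget";
    case SK_SEMCTL:     return "semctl";
    case SK_SEMTIMEDOP: return "semtimedop";
    case SK_MSGSND:     return "msgsnd";
    case SK_MSGRCV:     return "msgrcv";
    case SK_MSGGET:     return "msgget";
    case SK_MSGCTL:     return "msgctl";
    case SK_SHMAT:      return "shmat";
    case SK_SHMDT:      return "shmdt";
    case SK_SHMGET:     return "shmget";
    case SK_SHMCTL:     return "shmctl";
    default:            return NULL;   /* a real syscall would fail here */
    }
}
```

With one shared switch, a sub-call such as semtimedop is available on every architecture that uses the generic version.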
-
- Mar 08, 2010
-
-
Emese Revfy authored
Constify struct sysfs_ops. This is part of the ops structure constification effort started by Arjan van de Ven et al. Benefits of this constification:
* prevents modification of data that is shared (referenced) by many other structure instances at runtime
* detects/prevents accidental (but not intentional) modification attempts on archs that enforce read-only kernel data at runtime
* potentially better optimized code as the compiler can assume that the const data cannot be changed
* the compiler/linker move const data into .rodata and therefore exclude them from false sharing
Signed-off-by:
Emese Revfy <re.emese@gmail.com> Acked-by:
David Teigland <teigland@redhat.com> Acked-by:
Matt Domsch <Matt_Domsch@dell.com> Acked-by:
Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by:
Hans J. Koch <hjk@linutronix.de> Acked-by:
Pekka Enberg <penberg@cs.helsinki.fi> Acked-by:
Jens Axboe <jens.axboe@oracle.com> Acked-by:
Stephen Hemminger <shemminger@vyatta.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
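The constification pattern can be sketched in plain C as follows. The types `kobj_sketch` and `show_ops` are stand-ins invented for this example and are much simpler than the real sysfs structures.

```c
#include <stddef.h>
#include <stdio.h>

struct kobj_sketch {
    int value;
};

/* An "ops" table: function pointers shared by many object instances. */
struct show_ops {
    int (*show)(const struct kobj_sketch *obj, char *buf, size_t len);
};

static int value_show(const struct kobj_sketch *obj, char *buf, size_t len)
{
    return snprintf(buf, len, "%d\n", obj->value);
}

/*
 * Declared const: the compiler rejects assignments such as
 * value_ops.show = NULL, and the table can be placed in read-only data.
 */
static const struct show_ops value_ops = {
    .show = value_show,
};

/* Users only ever need a pointer-to-const to call through the table. */
static int show_via_ops(const struct show_ops *ops,
                        const struct kobj_sketch *obj, char *buf, size_t len)
{
    return ops->show ? ops->show(obj, buf, len) : -1;
}
```

Because every consumer takes `const struct show_ops *`, making the table itself const costs nothing at the call sites.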
-
- Mar 07, 2010
-
-
Wim Van Sebroeck authored
make the watchdog_info struct const where possible. Signed-off-by:
Wim Van Sebroeck <wim@iguana.be>
-
- Mar 06, 2010
-
-
H Hartley Sweeten authored
The macro any_online_node() is prone to producing sparse warnings due to the local symbol 'node'. Since all the in-tree users are really requesting the first online node (the mask argument is either NODE_MASK_ALL or node_online_map) just use the first_online_node macro and remove the any_online_node macro since there are no users. Signed-off-by:
H Hartley Sweeten <hsweeten@visionengravers.com> Acked-by:
David Rientjes <rientjes@google.com> Reviewed-by:
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Milton Miller <miltonm@bga.com> Cc: Nathan Fontenot <nfont@austin.ibm.com> Cc: Geoff Levand <geoffrey.levand@am.sony.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: David S. Miller <davem@davemloft.net> Cc: Benny Halevy <bhalevy@panasas.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
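The idea of "first online node" can be sketched as a find-first-set-bit over a node mask. This is a simplified model with a single 64-bit mask and the hypothetical names `SKETCH_MAX_NODES` and `first_node_sketch`; the kernel's nodemask implementation differs.

```c
#include <stdint.h>

#define SKETCH_MAX_NODES 64

/*
 * Return the lowest node number set in the mask, or SKETCH_MAX_NODES
 * when no node is set (an assumed sentinel for this sketch).
 */
static int first_node_sketch(uint64_t online_mask)
{
    for (int node = 0; node < SKETCH_MAX_NODES; node++) {
        if (online_mask & (UINT64_C(1) << node))
            return node;
    }
    return SKETCH_MAX_NODES;
}
```

A function like this declares `node` in its own scope, which avoids the kind of symbol clash a macro with a local variable can cause.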
-
- Mar 01, 2010
-
-
Liu Yu authored
The old method prematurely sets ESR and DEAR. Move this part after we decide to inject the interrupt, which is closer to how the hardware behaves. Signed-off-by:
Liu Yu <yu.liu@freescale.com> Acked-by:
Hollis Blanchard <hollis@penguinppc.org> Acked-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Liu Yu authored
commit 55fb1027c1cf9797dbdeab48180da530e81b1c39 doesn't update tlbcfg correctly. Fix it. And since the guest OS likes 'fixed' hardware, initializing tlbcfg every time the guest accesses it is useless. So move this part to init code. Signed-off-by:
Liu Yu <yu.liu@freescale.com> Acked-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
Liu Yu authored
commit 513579e3 changed the way we emulate PVR/PIR, which left PVR/PIR uninitialized on E500 and confused the guest. Signed-off-by:
Liu Yu <yu.liu@freescale.com> Acked-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
Liu Yu authored
The latest kernel has started to access l1csr0 to control L1. We just tell the guest that no operation is ongoing. Signed-off-by:
Liu Yu <yu.liu@freescale.com> Acked-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
Marcelo Tosatti authored
cleanup_srcu_struct on VM destruction remains broken:
BUG: unable to handle kernel paging request at ffffffffffffffff
IP: [<ffffffff802533d2>] srcu_read_lock+0x16/0x21
RIP: 0010:[<ffffffff802533d2>] [<ffffffff802533d2>] srcu_read_lock+0x16/0x21
Call Trace:
[<ffffffffa05354c4>] kvm_arch_vcpu_uninit+0x1b/0x48 [kvm]
[<ffffffffa05339c6>] kvm_vcpu_uninit+0x9/0x15 [kvm]
[<ffffffffa0569f7d>] vmx_free_vcpu+0x7f/0x8f [kvm_intel]
[<ffffffffa05357b5>] kvm_arch_destroy_vm+0x78/0x111 [kvm]
[<ffffffffa053315b>] kvm_put_kvm+0xd4/0xfe [kvm]
Move it to kvm_arch_destroy_vm. Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Reported-by:
Jan Kiszka <jan.kiszka@siemens.com>
-
Alexander Graf authored
We keep a copy of the MSR around that we use when we go into the guest context. That copy is basically the normal process MSR flags OR some allowed guest specified MSR flags. We also AND the external providers into this, so we get traps on FPU usage when we haven't activated it on the host yet. Currently this calculation is part of the set_msr function that we use whenever we set the guest MSR value. With the external providers, we also have the case that we don't modify the guest's MSR, but only want to update the shadow MSR. So let's move the shadow MSR parts to a separate function that we then use whenever we only need to update it. That way we don't accidentally kvm_vcpu_block within a preempt notifier context. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
SRR1 stores more information than just the MSR value. It also stores valuable information about the type of interrupt we received, for example whether the storage interrupt we just got was because of a missing htab entry or not. We use that information to speed up the exit path. Now if we get preempted before we can interpret the shadow_msr values, we get into vcpu_put which then calls the MSR handler, which then sets all the SRR1 information bits in shadow_msr to 0. Great. So let's preserve the SRR1 specific bits in shadow_msr whenever we set the MSR. They don't hurt. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
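Preserving a group of bits across an update is a standard mask-and-merge. The sketch below assumes a placeholder mask, `SKETCH_SRR1_INFO_MASK`, whose value is invented for illustration and does not come from the patch.

```c
#include <stdint.h>

/* Hypothetical mask of the SRR1 "interrupt information" bits. */
#define SKETCH_SRR1_INFO_MASK UINT64_C(0x00000000783f0000)

/*
 * Build the new shadow value from freshly computed MSR bits while
 * carrying over the information bits already stored in the old value.
 */
static uint64_t update_shadow_msr(uint64_t old_shadow, uint64_t new_msr_bits)
{
    return (new_msr_bits & ~SKETCH_SRR1_INFO_MASK) |
           (old_shadow & SKETCH_SRR1_INFO_MASK);
}
```

With this shape, a preemption that recomputes the MSR part no longer wipes the interrupt-type bits the exit path still wants to read.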
-
Alexander Graf authored
Commit 7d01b4c3ed2bb33ceaf2d270cb4831a67a76b51b introduced PACA backed vcpu values. With this patch, when a userspace app was setting GPRs before it was actually first loaded, the set values get discarded. This is because vcpu_load loads them from the vcpu backing store that we use whenever we're not owning the PACA. That behavior is not really a major problem, because we don't need it for qemu. Other users (like kvmctl) do have problems with it though, so let's better do it right. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
When our guest starts using either the FPU, Altivec or VSX we need to make sure Linux knows about it and sneak into its process switching code accordingly. This patch makes accesses to the above parts of the system work inside the VM. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
Linux contains quite some bits of code to load FPU, Altivec and VSX lazily for a task. It calls those bits in real mode, coming from an interrupt handler. For KVM we better reuse those, so let's wrap a bit of trampoline magic around them and then we can call them from normal module code. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
We need to explicitly only giveup VSX in KVM, so let's export that specific function to module space. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
An SLB entry contains two pieces of information related to size: 1) PTE size 2) SLB size. The L bit defines the PTE to be "large" (usually meaning 16MB), while SLB_VSID_B_1T defines that the SLB should span 1TB instead of the default 256MB. Apparently I messed things up and just put those two in one box, shook it heavily and came up with the current code which handles large pages incorrectly, because it also treats large page SLB entries as "1TB" segment entries. This patch splits those two features apart, making Linux guests boot even when they have > 256MB. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
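The fix amounts to decoding two independent flags instead of one. A minimal sketch, assuming made-up bit positions (`SKETCH_SLB_L`, `SKETCH_SLB_1T`) that stand in for the real L bit and SLB_VSID_B_1T definitions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, chosen only for this illustration. */
#define SKETCH_SLB_L  (UINT64_C(1) << 8)   /* large PTE size */
#define SKETCH_SLB_1T (UINT64_C(1) << 62)  /* 1TB segment size */

struct slb_info {
    bool large_page; /* page size attribute */
    bool tb_segment; /* segment size attribute */
};

/* Decode page size and segment size separately; neither implies the other. */
static struct slb_info decode_slb(uint64_t vsid_word)
{
    struct slb_info info;

    info.large_page = (vsid_word & SKETCH_SLB_L) != 0;
    info.tb_segment = (vsid_word & SKETCH_SLB_1T) != 0;
    return info;
}
```

An entry with only the large-page flag set then stays a 256MB segment, which is the case the earlier code got wrong.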
-
Alexander Graf authored
When we get a program interrupt in guest kernel mode, we try to emulate the instruction. If that fails, we report to the user and try again - at the exact same instruction pointer. So if the guest kernel really does trigger an invalid instruction, we loop forever. So let's better go and forward program exceptions to the guest when we don't know the instruction we're supposed to emulate. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
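The control flow can be sketched as below. The emulator here is a toy that "knows" exactly one primary opcode; `emulate_insn`, `handle_program_interrupt` and both enums are hypothetical names for this example.

```c
#include <stdint.h>

enum emulate_result { SKETCH_EMULATE_DONE, SKETCH_EMULATE_FAIL };
enum exit_action { SKETCH_RESUME_NEXT_INSN, SKETCH_INJECT_PROGRAM_IRQ };

/* Toy emulator: pretend only primary opcode 31 is understood. */
static enum emulate_result emulate_insn(uint32_t insn)
{
    return (insn >> 26) == 31 ? SKETCH_EMULATE_DONE : SKETCH_EMULATE_FAIL;
}

/*
 * On failure, hand the exception to the guest instead of retrying the
 * same instruction, so an invalid instruction cannot loop forever.
 */
static enum exit_action handle_program_interrupt(uint32_t insn)
{
    if (emulate_insn(insn) == SKETCH_EMULATE_DONE)
        return SKETCH_RESUME_NEXT_INSN;
    return SKETCH_INJECT_PROGRAM_IRQ;
}
```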
-
Alexander Graf authored
When we need to reinject a program interrupt into the guest, we also need to reinject the corresponding flags into the guest. Signed-off-by:
Alexander Graf <agraf@suse.de> Reported-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
The code to unset HID5.dcbz32 is broken. This patch makes it do the right rotate magic. Signed-off-by:
Alexander Graf <agraf@suse.de> Reported-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
Book3S needs some flags in SRR1 to get to know details about an interrupt. One such example is the trap instruction. It tells the guest kernel that a program interrupt is due to a trap using a bit in SRR1. This patch implements above behavior, making WARN_ON behave like WARN_ON. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
Currently we're racy when doing the transition from IR=1 to IR=0, from the module memory entry code to the real mode SLB switching code. To work around that I took a look at the RTAS entry code which is faced with a similar problem and did the same thing: A small helper in linear mapped memory that does mtmsr with IR=0 and then RFIs into the actual handler. Thanks to that trick we can safely take page faults in the entry code and only need to be really wary of what to do as of the SLB switching part. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
Using an RFI in IR=1 is dangerous. We need to set two SRRs and then do an RFI without getting interrupted at all, because every interrupt could potentially overwrite the SRR values. Fortunately, we don't need to RFI in at least this particular case of the code, so we can just replace it with an mtmsr and b. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
To fetch the last instruction we were interrupted on, we enable DR in early exit code, where we are still in a very transitional phase between guest and host state. Most of the time this seemed to work, but another CPU can easily flush our TLB and HTAB which makes us go in the Linux page fault handler which totally breaks because we still use the guest's SLB entries. To work around that, let's introduce a second KVM guest mode that defines that whenever we get a trap, we don't call the Linux handler or go into the KVM exit code, but just jump over the faulting instruction. That way a potentially bad lwz doesn't trigger any faults and we can later on interpret the invalid instruction we fetched as "fetch didn't work". Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
We're being horribly racy right now. All the entry and exit code hijacks random fields from the PACA that could easily be used by different code in case we get interrupted, for example by a #MC or even page fault. After discussing this with Ben, we figured it's best to reserve some more space in the PACA and just shove off some vcpu state to there. That way we can drastically improve the readability of the code, make it less racy and less complex. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
We now have helpers for the GPRs, so let's also add some for CR and XER. Having them in the PACA simplifies code a lot, as we don't need to care about where to store CC or not to overflow any integers. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
All code in PPC KVM currently accesses gprs in the vcpu struct directly. While there's nothing wrong with that wrt the current way gprs are stored and loaded, it doesn't suffice for the PACA acceleration that will follow in this patchset. So let's just create little wrapper inline functions that we call whenever a GPR needs to be read from or written to. The compiled code shouldn't really change at all for now. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
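Accessor wrappers of this kind might look like the following sketch. `vcpu_sketch`, `sketch_get_gpr` and `sketch_set_gpr` are illustrative names, and the struct is reduced to just the register array.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_GPRS 32

struct vcpu_sketch {
    uint64_t gpr[SKETCH_NUM_GPRS];
};

/* All reads go through one place, so the backing store can change later. */
static inline uint64_t sketch_get_gpr(const struct vcpu_sketch *vcpu, int num)
{
    assert(num >= 0 && num < SKETCH_NUM_GPRS);
    return vcpu->gpr[num];
}

/* Likewise for writes. */
static inline void sketch_set_gpr(struct vcpu_sketch *vcpu, int num,
                                  uint64_t val)
{
    assert(num >= 0 && num < SKETCH_NUM_GPRS);
    vcpu->gpr[num] = val;
}
```

Since the wrappers are inline and simply index the array, the generated code should match direct field access until the storage location actually changes.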
-
Alexander Graf authored
The PowerPC C ABI defines that registers r14-r31 need to be preserved across function calls. Since our exit handler is written in C, we can make use of that and don't need to reload r14-r31 on every entry/exit cycle. This technique is also used in the BookE code and is called "lightweight exits" there. To follow the tradition, it's called the same in Book3S. So far this optimization was disabled though, as the code didn't do what it was expected to do, but failed to work. This patch fixes and enables lightweight exits again. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
Alexander Graf authored
When we're loading bolted entries into the SLB again, we're checking if an entry is in use and only slbmte it when it is. Unfortunately, the check always goes to the skip label of the first entry, resulting in an endless loop when it actually gets triggered. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
Marcelo Tosatti authored
Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
Marcelo Tosatti authored
Required for SRCU conversion later. Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
Marcelo Tosatti authored
Have a pointer to an allocated region inside struct kvm. [alex: fix ppc book 3s] Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
Alexander Graf authored
Because we now emulate the DEC interrupt according to real life behavior, there's no need to keep the AGGRESSIVE_DEC hack around. Let's just remove it. Signed-off-by:
Alexander Graf <agraf@suse.de> Acked-by:
Hollis Blanchard <hollis@penguinppc.org> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Alexander Graf authored
We treated the DEC interrupt like an edge based one. This is not true for Book3s. The DEC keeps firing until mtdec is issued again and thus clears the interrupt line. So let's implement this logic in KVM too. This patch moves the line clearing from the firing of the interrupt to the mtdec emulation. This makes PPC64 guests work without AGGRESSIVE_DEC defined. Signed-off-by:
Alexander Graf <agraf@suse.de> Acked-by:
Hollis Blanchard <hollis@penguinppc.org> Signed-off-by:
Avi Kivity <avi@redhat.com>
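The difference between edge-style and level-style handling can be modelled with a single pending flag. In this sketch (all names hypothetical) delivering the interrupt leaves the line asserted, and only the emulated mtdec clears it.

```c
#include <stdbool.h>
#include <stdint.h>

struct dec_sketch {
    bool line_asserted; /* interrupt line state */
    uint32_t value;     /* last value written by the guest */
};

/* The decrementer expired: raise the line. */
static void dec_fire(struct dec_sketch *dec)
{
    dec->line_asserted = true;
}

/* Delivery does not clear the line, so the interrupt keeps firing. */
static bool dec_deliver(const struct dec_sketch *dec)
{
    return dec->line_asserted;
}

/* Emulated mtdec: the guest reprograms the decrementer, clearing the line. */
static void dec_emulate_mtdec(struct dec_sketch *dec, uint32_t value)
{
    dec->value = value;
    dec->line_asserted = false;
}
```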
-
Alexander Graf authored
We're using a switch table to find the irqprio that belongs to a specific interrupt vector. This table is part of the interrupt inject logic. Since we'll add a new function to stop interrupts, let's move this table out of the injection logic into a separate function. Signed-off-by:
Alexander Graf <agraf@suse.de> Acked-by:
Hollis Blanchard <hollis@penguinppc.org> Signed-off-by:
Avi Kivity <avi@redhat.com>
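Factoring the lookup out of the injection path could look like this sketch. The vector numbers and priority names are assumptions chosen for illustration; the point is that queueing and dequeueing share one `vec_to_prio` helper.

```c
/* Illustrative priorities; lower value means higher priority here. */
enum sketch_prio {
    SKETCH_PRIO_SYSTEM_RESET,
    SKETCH_PRIO_MACHINE_CHECK,
    SKETCH_PRIO_EXTERNAL,
    SKETCH_PRIO_PROGRAM,
    SKETCH_PRIO_DECREMENTER,
    SKETCH_PRIO_MAX
};

/* The switch table, now a standalone function. */
static unsigned int vec_to_prio(unsigned int vec)
{
    switch (vec) {
    case 0x100: return SKETCH_PRIO_SYSTEM_RESET;
    case 0x200: return SKETCH_PRIO_MACHINE_CHECK;
    case 0x500: return SKETCH_PRIO_EXTERNAL;
    case 0x700: return SKETCH_PRIO_PROGRAM;
    case 0x900: return SKETCH_PRIO_DECREMENTER;
    default:    return SKETCH_PRIO_MAX; /* unknown vector */
    }
}

/* Mark an interrupt pending. */
static void queue_irq(unsigned long *pending, unsigned int vec)
{
    unsigned int prio = vec_to_prio(vec);

    if (prio < SKETCH_PRIO_MAX)
        *pending |= 1UL << prio;
}

/* Stop a pending interrupt, reusing the same lookup. */
static void dequeue_irq(unsigned long *pending, unsigned int vec)
{
    unsigned int prio = vec_to_prio(vec);

    if (prio < SKETCH_PRIO_MAX)
        *pending &= ~(1UL << prio);
}
```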
-
Avi Kivity authored
s390 doesn't have mmio, this will simplify ifdefing it out. Signed-off-by:
Avi Kivity <avi@redhat.com>
-
- Feb 26, 2010
-
-
Peter Zijlstra authored
Since the cpu argument to hw_perf_group_sched_in() is always smp_processor_id(), simplify the code a little by removing this argument and using the current cpu where needed. Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1265890918.5396.3.camel@laptop> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Benjamin Herrenschmidt authored
Anton's commit enabling the use of the lwsync fixup mechanism on 64-bit breaks modules. The lwsync fixup section uses .long instead of the FTR_ENTRY_OFFSET macro used by other fixup sections, and thus will generate 32-bit relocations that our module loader cannot resolve. This changes it to use the same type as other feature sections. Note however that we might want to consider using 32-bit for all the feature fixup offsets and add support for R_PPC_REL32 to module_64.c instead as that would reduce the size of the kernel image. I'll leave that as an exercise for the reader for now... Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- Feb 23, 2010
-
-
Bjorn Helgaas authored
No functional change; this converts loops that iterate from 0 to PCI_BUS_NUM_RESOURCES through pci_bus resource[] table to use the pci_bus_for_each_resource() iterator instead. This doesn't change the way resources are stored; it merely removes dependencies on the fact that they're in a table. Signed-off-by:
Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by:
Jesse Barnes <jbarnes@virtuousgeek.org>
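An iterator macro that hides the table layout can be sketched as follows. `bus_for_each_resource`, `bus_sketch` and `res_sketch` are invented for this example and only imitate the shape of the real iterator.

```c
#include <stddef.h>

#define SKETCH_BUS_NUM_RESOURCES 4

struct res_sketch {
    unsigned long start;
    unsigned long end; /* inclusive */
};

struct bus_sketch {
    struct res_sketch *resource[SKETCH_BUS_NUM_RESOURCES];
};

/* Callers see (bus, res, i) and never index the table themselves. */
#define bus_for_each_resource(bus, res, i)                        \
    for ((i) = 0;                                                 \
         (i) < SKETCH_BUS_NUM_RESOURCES &&                        \
         ((res) = (bus)->resource[(i)], 1);                       \
         (i)++)

/* Example user: sum the sizes of all populated resources. */
static unsigned long total_size(const struct bus_sketch *bus)
{
    struct res_sketch *res;
    unsigned long total = 0;
    int i;

    bus_for_each_resource(bus, res, i) {
        if (!res)
            continue; /* empty slot */
        total += res->end - res->start + 1;
    }
    return total;
}
```

If the storage later stops being a fixed table, only the macro has to change, not every loop.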
-
Dominik Brodowski authored
Now that we return the new resource start position, there is no need to update "struct resource" inside the align function. Therefore, mark the struct resource as const. Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by:
Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by:
Jesse Barnes <jbarnes@virtuousgeek.org>
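A rough sketch of an align callback that returns the new start rather than writing it back, which is what allows the resource argument to be const. The names and the power-of-two restriction are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

struct res_sketch {
    uint64_t start;
    uint64_t end;
};

/*
 * Return the aligned start position. The resource is only read, so it
 * can be passed as pointer-to-const; the caller applies the result.
 */
static uint64_t align_resource_start(const struct res_sketch *res,
                                     uint64_t align)
{
    /* This sketch only handles power-of-two alignments. */
    assert(align != 0 && (align & (align - 1)) == 0);
    return (res->start + align - 1) & ~(align - 1);
}
```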
-