- Jan 30, 2009
-
-
Jeremy Fitzhardinge authored
Impact: Optimization Each asm paravirt-ops call says what registers are available for clobbering. This patch makes use of this to selectively save/restore registers around each pvops call. In many cases this significantly shrinks code size. Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by:
H. Peter Anvin <hpa@zytor.com>
-
Jeremy Fitzhardinge authored
Impact: Optimization Several paravirt ops implementations simply return their arguments, the most obvious being the make_pte/pte_val class of operations on native. On 32-bit, the identity function is literally a no-op, as the calling convention uses the same registers for the first argument and return. On 64-bit, it can be implemented with a single "mov". This patch adds special identity functions for 32 and 64 bit argument, and machinery to recognize them and replace them with either nops or a mov as appropriate. At the moment, the only users for the identity functions are the pagetable entry conversion functions. The result is a measureable improvement on pagetable-heavy benchmarks (2-3%, reducing the pvops overhead from 5 to 2%). Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by:
H. Peter Anvin <hpa@zytor.com>
-
Randy Dunlap authored
Move DMA-mapping.txt to Documentation/PCI/. DMA-mapping.txt was supposed to be moved from Documentation/ to Documentation/PCI/. The 00-INDEX files in those two directories were updated, along with a few other text files, but the file itself somehow escaped being moved, so move it and update more text files and source files with its new location. Signed-off-by:
Randy Dunlap <randy.dunlap@oracle.com> Acked-by:
Greg Kroah-Hartman <gregkh@suse.de> cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Jan 27, 2009
-
-
Brian Gerst authored
Impact: sync 32 and 64-bit code Merge load_gs_base() into switch_to_new_gdt(). Load the GDT and per-cpu state for the boot cpu when its new area is set up. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Impact: optimization mb() generates an mfence instruction, which is not needed here. Only a compiler barrier is needed, and that is handled by the memory clobber in the wrmsrl function. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Impact: cleanup Rename init_gdt() to setup_percpu_segment(), and move it to setup_percpu.c. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Impact: Code movement, no functional change. Move setup_cpu_local_masks() to kernel/cpu/common.c, where the masks are defined. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Impact: Code movement, no functional change. Move the 64-bit NUMA code from setup_percpu.c to numa_64.c Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
- Jan 25, 2009
-
-
Ingo Molnar authored
the RDC and ELAN platforms use slighly different PIT clocks, resulting in a timex.h hack that changes PIT_TICK_RATE during build time. But if a tester enables any of these platform support .config options, the PIT will be miscalibrated on standard PC platforms. So use one frequency - in a subsequent patch we'll add a quirk to allow x86 platforms to define different PIT frequencies. Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
- Jan 23, 2009
-
-
Peter Zijlstra authored
On -rt we were seeing spurious bad page states like: Bad page state in process 'firefox' page:c1bc2380 flags:0x40000000 mapping:c1bc2390 mapcount:0 count:0 Trying to fix it up, but a reboot is needed Backtrace: Pid: 503, comm: firefox Not tainted 2.6.26.8-rt13 #3 [<c043d0f3>] ? printk+0x14/0x19 [<c0272d4e>] bad_page+0x4e/0x79 [<c0273831>] free_hot_cold_page+0x5b/0x1d3 [<c02739f6>] free_hot_page+0xf/0x11 [<c0273a18>] __free_pages+0x20/0x2b [<c027d170>] __pte_alloc+0x87/0x91 [<c027d25e>] handle_mm_fault+0xe4/0x733 [<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63 [<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63 [<c0218875>] do_page_fault+0x36f/0x88a This is the case where a concurrent fault already installed the PTE and we get to free the newly allocated one. This is due to pgtable_page_ctor() doing the spin_lock_init(&page->ptl) which is overlaid with the {private, mapping} struct. union { struct { unsigned long private; struct address_space *mapping; }; spinlock_t ptl; struct kmem_cache *slab; struct page *first_page; }; Normally the spinlock is small enough to not stomp on page->mapping, but PREEMPT_RT=y has huge 'spin'locks. But lockdep kernels should also be able to trigger this splat, as the lock tracking code grows the spinlock to cover page->mapping. The obvious fix is calling pgtable_page_dtor() like the regular pte free path __pte_free_tlb() does. It seems all architectures except x86 and nm10300 already do this, and nm10300 doesn't seem to use pgtable_page_ctor(), which suggests it doesn't do SMP or simply doesnt do MMU at all or something. Signed-off-by:
Peter Zijlstra <a.p.zijlsta@chello.nl> Signed-off-by:
Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org>
-
Brian Gerst authored
Impact: shrink size of irq_cpustat_t when possible Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Impact: cleanup Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Impact: better code generation and removal of unused field for 32bit In general, use the 64-bit version. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Impact: cleanup APIC definitions aren't needed here. Remove the include and fix up the fallout. tj: added include to mce_intel_64.c. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Impact: bogus irq_cpustat field removed idle_timestamp is left over from the removed irqbalance code. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
- Jan 22, 2009
-
-
Jeremy Fitzhardinge authored
It's not necessary to deconstruct and reconstruct a pte every time its flags are being updated. Introduce pte_set_flags and pte_clear_flags to set and clear flags in a pte. This allows the flag manipulation code to be inlined, and avoids calls via paravirt-ops. Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Jeremy Fitzhardinge authored
pte_flags() was introduced as a new pvop in order to extract just the flags portion of a pte, which is a potentially cheaper operation than extracting the page number as well. It turns out this operation is not needed, because simply using a mask to extract the flags from a pte is sufficient for all current users. Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
- Jan 21, 2009
-
-
H. Peter Anvin authored
Impact: None (new bit definitions currently unused) Add bit definitions for the MSR_IA32_MISC_ENABLE MSRs to <asm/msr-index.h>. Signed-off-by:
H. Peter Anvin <hpa@linux.intel.com>
-
Nicholas Piggin authored
Make X86 SGI Ultraviolet support configurable. Saves about 13K of text size on my modest config. text data bss dec hex filename 6770537 1158680 694356 8623573 8395d5 vmlinux 6757492 1157664 694228 8609384 835e68 vmlinux.nouv Signed-off-by:
Nick Piggin <npiggin@suse.de> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
This reverts commit 4217458d. Justin Madru bisected this commit, it was causing weird Firefox crashes. The reason is that GCC mis-optimizes (re-uses) the on-stack parameters of the calling frame, which corrupts the syscall return pt_regs state and thus corrupts user-space register state. So we go back to the slightly less clean but more optimization-safe method of getting to pt_regs. Also add a comment to explain this. Resolves: http://bugzilla.kernel.org/show_bug.cgi?id=12505 Reported-and-bisected-by:
Justin Madru <jdm64@gawab.com> Tested-by:
Justin Madru <jdm64@gawab.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Tejun Heo authored
Impact: less contention when issuing invalidate IPI, cleanup Make x86_32 use the same tlb code as 64bit. The 64bit code uses multiple IPI vectors for tlb shootdown to reduce contention. This patch makes x86_32 allocate the same 8 IPIs as x86_64 and share the code paths. Note that the usage of asmlinkage is inconsistent for x86_32 and 64 and calls for further cleanup. This has been noted with a FIXME comment in tlb_64.c. Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: clean up, ipi vector number reordering for x86_32 Make the following changes to prepare for tlb merge. * reorder x86_32 ip vectors * adjust tlb_32.c and tlb_64.c such that their logics coincide exactly - on spurious invalidate ipi, tlb_32 acks the irq - tlb_64 now has proper memory barriers around clearing flush_cpumask (no change in generated code) * unexport flush_tlb_page from tlb_32.c, there's no user * use unsigned int for cpu id * drop unnecessary includes from tlb_64.c Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: cleanup Make the following uv related cleanups. * collect visible uv related definitions and interfaces into uv/uv.h and use it. this cleans up the messy situation where on 64bit, uv is defined properly, on 32bit generic it's dummy and on the rest undefined. after this clean up, uv is defined on 64 and dummy on 32. * update uv_flush_tlb_others() such that it takes cpumask of to-be-flushed cpus as argument, instead of that minus self, and returns yet-to-be-flushed cpumask, instead of modifying the passed in parameter. this interface change will ease dummy implementation of uv_flush_tlb_others() and makes uv tlb flush related stuff defined in tlb_uv proper. Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Impact: cleanup, better irq_regs code generation for x86_64 Make 64-bit use the same optimizations as 32-bit. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Impact: cleanup tj: * changed cpu to unsigned as was done on mmu_context_64.h as cpu id is officially unsigned int * added missing ';' to 32bit version of deactivate_mm() Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Impact: slightly better code generation for percpu_to_op() The processor will sign-extend 32-bit immediate values in 64-bit operations. Use the 'e' constraint ("32-bit signed integer constant, or a symbolic reference known to fit that range") for 64-bit constants. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: cleanup In switch_to(), instead of taking offset to irq_stack_union.stack, make it a proper percpu access using __percpu_arg() and per_cpu_var(). Signed-off-by:
Tejun Heo <tj@kernel.org>
-
- Jan 20, 2009
-
-
Brian Gerst authored
Impact: cleanup Signed-off-by:
Brian Gerst <brgerst@gmail.com>
-
Brian Gerst authored
Impact: x86_64 percpu area layout change, irq_stack now at the beginning Now that the PDA is empty except for the stack canary, it can be removed. The irqstack is moved to the start of the per-cpu section. If the stack protector is enabled, the canary overlaps the bottom 48 bytes of the irqstack. tj: * updated subject * dropped asm relocation of irq_stack_ptr * updated comments a bit * rebased on top of stack canary changes Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Impact: cleanup Copy the code to cpu_init() to satisfy the requirement that the cpu be reinitialized. Remove all other calls, since the segments are already initialized in head_64.S. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: no unnecessary stack canary swapping during context switch There's no point in moving stack_canary around during context switch if it's not enabled. Conditionalize it. Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: cleanup Make the following cleanups. * remove duplicate comment from boot_init_stack_canary() which fits better in the other place - cpu_idle(). * move stack_canary offset check from __switch_to() to boot_init_stack_canary(). Signed-off-by:
Tejun Heo <tj@kernel.org>
-
- Jan 18, 2009
-
-
Brian Gerst authored
Accessing memory through %gs should not use rip-relative addressing. Adding a P prefix for the argument tells gcc to not add (%rip) to the memory references. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
tj: s/isidle/is_idle/ Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
tj: * s/nodenumber/node_number/ * removed now unused pda variable from pda_init() Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
tj: s/irqcount/irq_count/ Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
tj: * in asm-offsets_64.c, pda.h inclusion shouldn't be removed as pda is still referenced in the file * s/oldrsp/old_rsp/ Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Also clean up PER_CPU_VAR usage in xen-asm_64.S tj: * remove now unused stack_thread_info() * s/kernelstack/kernel_stack/ * added FIXME comment in xen-asm_64.S Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Brian Gerst authored
tj: moved cpu_number definition out of CONFIG_HAVE_SETUP_PER_CPU_AREA for voyager. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-