- Aug 24, 2010
-
-
Anton Blanchard authored
I'm sick of seeing ppc64_runlatch_off in our profiles, so inline it into the callers. To avoid a mess of circular includes I didn't add it as an inline function. Signed-off-by:
Anton Blanchard <anton@samba.org> Acked-by:
Olof Johansson <olof@lixom.net> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
The code is wrapped in an #if 0, but it's wrong so we may as well fix it. Signed-off-by:
Anton Blanchard <anton@samba.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
This makes the 64-bit kernel use 64-bit signed integers for the counter (effectively supporting 32-bit of active count in the semaphore), thus avoiding things like overflow of the mmap_sem if you use a really crazy number of threads Note: Ideally the type in the structure should be atomic_long_t rather than "long". However, there's some nasty issues with that. It needs to be initialized statically -and- lib/rwsem.c does things like sem->count = RWSEM_UNLOCKED_VALUE; Now, if you mix in the fact that atomic_* types are actually structures with one member and note typedefs of a scalar, it makes its really nasty. So I stuck to what we did before using a long and casts for now. Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- Aug 14, 2010
-
-
Sam Ravnborg authored
unifdef-y and header-y have same semantic, so drop unifdef-y Signed-off-by:
Sam Ravnborg <sam@ravnborg.org>
-
- Aug 11, 2010
-
-
FUJITA Tomonori authored
Architectures implement dma_is_consistent() in different ways (some misinterpret the definition of API in DMA-API.txt). So it hasn't been so useful for drivers. We have only one user of the API in tree. Unlikely out-of-tree drivers use the API. Even if we fix dma_is_consistent() in some architectures, it doesn't look useful at all. It was invented long ago for some old systems that can't allocate coherent memory at all. It's better to export only APIs that are definitely necessary for drivers. Let's remove this API. Signed-off-by:
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
FUJITA Tomonori authored
dma_get_cache_alignment returns the minimum DMA alignment. Architectures defines it as ARCH_DMA_MINALIGN (formally ARCH_KMALLOC_MINALIGN). So we can unify dma_get_cache_alignment implementations. Note that some architectures implement dma_get_cache_alignment wrongly. dma_get_cache_alignment() should return the minimum DMA alignment. So fully-coherent architectures should return 1. This patch also fixes this issue. Signed-off-by:
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: <linux-arch@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
FUJITA Tomonori authored
Now each architecture has the own dma_get_cache_alignment implementation. dma_get_cache_alignment returns the minimum DMA alignment. Architectures define it as ARCH_KMALLOC_MINALIGN (it's used to make sure that malloc'ed buffer is DMA-safe; the buffer doesn't share a cache with the others). So we can unify dma_get_cache_alignment implementations. This patch: dma_get_cache_alignment() needs to know if an architecture defines ARCH_KMALLOC_MINALIGN or not (needs to know if architecture has DMA alignment restriction). However, slab.h define ARCH_KMALLOC_MINALIGN if architectures doesn't define it. Let's rename ARCH_KMALLOC_MINALIGN to ARCH_DMA_MINALIGN. ARCH_KMALLOC_MINALIGN is used only in the internals of slab/slob/slub (except for crypto). Signed-off-by:
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: <linux-arch@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Aug 10, 2010
-
-
hyc@symas.com authored
This patch is against the 2.6.34 source. Paraphrased from the 1989 BSD patch by David Borman @ cray.com: These are the changes needed for the kernel to support LINEMODE in the server. There is a new bit in the termios local flag word, EXTPROC. When this bit is set, several aspects of the terminal driver are disabled. Input line editing, character echo, and mapping of signals are all disabled. This allows the telnetd to turn off these functions when in linemode, but still keep track of what state the user wants the terminal to be in. New ioctl: TIOCSIG Generate a signal to processes in the current process group of the pty. There is a new mode for packet driver, the TIOCPKT_IOCTL bit. When packet mode is turned on in the pty, and the EXTPROC bit is set, then whenever the state of the pty is changed, the next read on the master side of the pty will have the TIOCPKT_IOCTL bit set. This allows the process on the server side of the pty to know when the state of the terminal has changed; it can then issue the appropriate ioctl to retrieve the new state. Since the original BSD patches accompanied the source code for telnet I've left that reference here, but obviously the feature is useful for any remote terminal protocol, including ssh. The corresponding feature has existed in the BSD tty driver since 1989. For historical reference, a good copy of the relevant files can be found here: http://anonsvn.mit.edu/viewvc/krb5/trunk/src/appl/telnet/?pathrev=17741 Signed-off-by:
Howard Chu <hyc@symas.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Cesar Eduardo Barros authored
kunmap_atomic() is currently at level -4 on Rusty's "Hard To Misuse" list[1] ("Follow common convention and you'll get it wrong"), except in some architectures when CONFIG_DEBUG_HIGHMEM is set[2][3]. kunmap() takes a pointer to a struct page; kunmap_atomic(), however, takes takes a pointer to within the page itself. This seems to once in a while trip people up (the convention they are following is the one from kunmap()). Make it much harder to misuse, by moving it to level 9 on Rusty's list[4] ("The compiler/linker won't let you get it wrong"). This is done by refusing to build if the type of its first argument is a pointer to a struct page. The real kunmap_atomic() is renamed to kunmap_atomic_notypecheck() (which is what you would call in case for some strange reason calling it with a pointer to a struct page is not incorrect in your code). The previous version of this patch was compile tested on x86-64. [1] http://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html [2] In these cases, it is at level 5, "Do it right or it will always break at runtime." [3] At least mips and powerpc look very similar, and sparc also seems to share a common ancestor with both; there seems to be quite some degree of copy-and-paste coding here. The include/asm/highmem.h file for these three archs mention x86 CPUs at its top. [4] http://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html [5] As an aside, could someone tell me why mn10300 uses unsigned long as the first parameter of kunmap_atomic() instead of void *? Signed-off-by:
Cesar Eduardo Barros <cesarb@cesarb.net> Cc: Russell King <linux@arm.linux.org.uk> (arch/arm) Cc: Ralf Baechle <ralf@linux-mips.org> (arch/mips) Cc: David Howells <dhowells@redhat.com> (arch/frv, arch/mn10300) Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> (arch/mn10300) Cc: Kyle McMartin <kyle@mcmartin.ca> (arch/parisc) Cc: Helge Deller <deller@gmx.de> (arch/parisc) Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> (arch/parisc) Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> (arch/powerpc) Cc: Paul Mackerras <paulus@samba.org> (arch/powerpc) Cc: "David S. Miller" <davem@davemloft.net> (arch/sparc) Cc: Thomas Gleixner <tglx@linutronix.de> (arch/x86) Cc: Ingo Molnar <mingo@redhat.com> (arch/x86) Cc: "H. Peter Anvin" <hpa@zytor.com> (arch/x86) Cc: Arnd Bergmann <arnd@arndb.de> (include/asm-generic) Cc: Rusty Russell <rusty@rustcorp.com.au> ("Hard To Misuse" list) Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Aug 07, 2010
-
-
FUJITA Tomonori authored
Architectures don't need to define ISA_DMA_THRESHOLD anymore. Signed-off-by:
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by:
James Bottomley <James.Bottomley@suse.de> Acked-by:
David Howells <dhowells@redhat.com> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
Eric Millbrandt authored
Work around a silicon bug in the ac97 reset functionality of the mpc5200(b). The implementation of the ac97 "cold" reset is flawed. If the sync and output lines are high when reset is asserted the attached ac97 device may go into test mode. Avoid this by reconfiguring the psc to gpio mode and generating the reset manually. From MPC5200B User's Manual: "Some AC97 devices goes to a test mode, if the Sync line is high during the Res line is low (reset phase). To avoid this behavior the Sync line must be also forced to zero during the reset phase. To do that, the pin muxing should switch to GPIO mode and the GPIO control register should be used to control the output lines." Signed-off-by:
Eric Millbrandt <emillbrandt@dekaresearch.com> Signed-off-by:
Grant Likely <grant.likely@secretlab.ca>
-
- Aug 01, 2010
-
-
Anatolij Gustschin authored
MPC5121 DIU configuration/setup as initialized by the boot loader currently will get lost while booting Linux. As a result displaying the boot splash is not possible through the boot process. To prevent this we reserve configured DIU frame buffer address range while booting and preserve AOI descriptor and gamma table so that DIU continues displaying through the whole boot process. On first open from user space DIU frame buffer driver releases the reserved frame buffer area and continues to operate as usual. Signed-off-by:
John Rigby <jcrigby@gmail.com> Signed-off-by:
Anatolij Gustschin <agust@denx.de> Acked-by:
Timur Tabi <timur@freescale.com> Signed-off-by:
Grant Likely <grant.likely@secretlab.ca>
-
Joerg Roedel authored
This patch converts unnecessary divide and modulo operations in the KVM large page related code into logical operations. This allows to convert gfn_t to u64 while not breaking 32 bit builds. Signed-off-by:
Joerg Roedel <joerg.roedel@amd.com> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
Alexander Graf authored
We just introduced generic functions to handle shadow pages on PPC. This patch makes the respective backends make use of them, getting rid of a lot of duplicate code along the way. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
Alexander Graf authored
Initially we had to search for pte entries to invalidate them. Since the logic has improved since then, we can just get rid of the search function. Signed-off-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Avi Kivity <avi@redhat.com>
-
Grant Likely authored
This patch moves the declaration of of_get_address(), of_get_pci_address(), and of_pci_address_to_resource() out of arch code and into the common linux/of_address header file. This patch also fixes some of the asm/prom.h ordering issues. It still includes some header files that it ideally shouldn't be, but at least the ordering is consistent now so that of_* overrides work. Signed-off-by:
Grant Likely <grant.likely@secretlab.ca>
-
Andreas Schwab authored
Instead of instantiating a whole thread_struct on the stack use only the required parts of it. Signed-off-by:
Andreas Schwab <schwab@linux-m68k.org> Tested-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
- Jul 31, 2010
-
-
Matt Evans authored
With dynamic PACAs, the kexecing CPU's PACA won't lie within the kernel static data and there is a chance that something may stomp it when preparing to kexec. This patch switches this final CPU to a static PACA just before we pull the switch. Signed-off-by:
Matt Evans <matt@ozlabs.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- Jul 30, 2010
-
-
Grant Likely authored
of_node_to_nid() is only relevant in a few architectures. Don't force everyone to implement it anyway. Signed-off-by:
Grant Likely <grant.likely@secretlab.ca>
-
- Jul 28, 2010
-
-
Paul Mackerras authored
Since the decrementer and timekeeping code was moved over to using the generic clockevents and timekeeping infrastructure, several variables and functions have been obsolete and effectively unused. This deletes them. In particular, wakeup_decrementer() is no longer needed since the generic code reprograms the decrementer as part of the process of resuming the timekeeping code, which happens during sysdev resume. Thus the wakeup_decrementer calls in the suspend_enter methods for 52xx platforms have been removed. The call in the powermac cpu frequency change code has been replaced by set_dec(1), which will cause a timer interrupt as soon as interrupts are enabled, and the generic code will then reprogram the decrementer with the correct value. This also simplifies the generic_suspend_en/disable_irqs functions and makes them static since they are not referenced outside time.c. The preempt_enable/disable calls are removed because the generic code has disabled all but the boot cpu at the point where these functions are called, so we can't be moved to another cpu. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Paul Mackerras authored
Currently it is possible for userspace to see the result of gettimeofday() going backwards by 1 microsecond, assuming that userspace is using the gettimeofday() in the VDSO. The VDSO gettimeofday() algorithm computes the time in "xsecs", which are units of 2^-20 seconds, or approximately 0.954 microseconds, using the algorithm now = (timebase - tb_orig_stamp) * tb_to_xs + stamp_xsec and then converts the time in xsecs to seconds and microseconds. The kernel updates the tb_orig_stamp and stamp_xsec values every tick in update_vsyscall(). If the length of the tick is not an integer number of xsecs, then some precision is lost in converting the current time to xsecs. For example, with CONFIG_HZ=1000, the tick is 1ms long, which is 1048.576 xsecs. That means that stamp_xsec will advance by either 1048 or 1049 on each tick. With the right conditions, it is possible for userspace to get (timebase - tb_orig_stamp) * tb_to_xs being 1049 if the kernel is slightly late in updating the vdso_datapage, and then for stamp_xsec to advance by 1048 when the kernel does update it, and for userspace to then see (timebase - tb_orig_stamp) * tb_to_xs being zero due to integer truncation. The result is that time appears to go backwards by 1 microsecond. To fix this we change the VDSO gettimeofday to use a new field in the VDSO datapage which stores the nanoseconds part of the time as a fractional number of seconds in a 0.32 binary fraction format. (Or put another way, as a 32-bit number in units of 0.23283 ns.) This is convenient because we can use the mulhwu instruction to convert it to either microseconds or nanoseconds. Since it turns out that computing the time of day using this new field is simpler than either using stamp_xsec (as gettimeofday does) or stamp_xtime.tv_nsec (as clock_gettime does), this converts both gettimeofday and clock_gettime to use the new field. The existing __do_get_tspec function is converted to use the new field and take a parameter in r7 that indicates the desired resolution, 1,000,000 for microseconds or 1,000,000,000 for nanoseconds. The __do_get_xsec function is then unused and is deleted. The new algorithm is now = ((timebase - tb_orig_stamp) << 12) * tb_to_xs + (stamp_xtime_seconds << 32) + stamp_sec_fraction with 'now' in units of 2^-32 seconds. That is then converted to seconds and either microseconds or nanoseconds with seconds = now >> 32 partseconds = ((now & 0xffffffff) * resolution) >> 32 The 32-bit VDSO code also makes a further simplification: it ignores the bottom 32 bits of the tb_to_xs value, which is a 0.64 format binary fraction. Doing so gets rid of 4 multiply instructions. Assuming a timebase frequency of 1GHz or less and an update interval of no more than 10ms, the upper 32 bits of tb_to_xs will be at least 4503599, so the error from ignoring the low 32 bits will be at most 2.2ns, which is more than an order of magnitude less than the time taken to do gettimeofday or clock_gettime on our fastest processors, so there is no possibility of seeing inconsistent values due to this. This also moves update_gtod() down next to its only caller, and makes update_vsyscall use the time passed in via the wall_time argument rather than accessing xtime directly. At present, wall_time always points to xtime, but that could change in future. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- Jul 24, 2010
-
-
Grant Likely authored
of_device is currently just an #define alias to platform_device until it gets removed entirely. This patch removes references to it from the include directories and the core drivers/of code. Signed-off-by:
Grant Likely <grant.likely@secretlab.ca> Acked-by:
David S. Miller <davem@davemloft.net>
-
Grant Likely authored
It is mostly unused now. Sparc has a few defines left in it, but they can be moved to other headers. Removing this header means that new architectures adding CONFIG_OF support don't need to also add this header file. Signed-off-by:
Grant Likely <grant.likely@secretlab.ca> Acked-by:
David S. Miller <davem@davemloft.net>
-
Grant Likely authored
Only thing left in it is of_instantiate_rtc() which can be moved to asm/prom.h on PowerPC and is unused in microblaze. Signed-off-by:
Grant Likely <grant.likely@secretlab.ca> Acked-by:
David S. Miller <davem@davemloft.net>
-
- Jul 23, 2010
-
-
Benjamin Herrenschmidt authored
This adds some debug output to our MMU hash code to print out some useful debug data if the hypervisor refuses the insertion (which should normally never happen). Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> ---
-
- Jul 22, 2010
-
-
Kumar Gala authored
The KEXEC_*_MEMORY_LIMITs are inclusive addresses. We define them as 2Gs as that is what we allow mapping via TLBs. However, this should be 2G - 1 to be inclusive, otherwise if we have >2G of memory in a system we fail to boot properly via kexec. Signed-off-by:
Kumar Gala <galak@kernel.crashing.org>
-
- Jul 19, 2010
-
-
Grant Likely authored
Signed-off-by:
Grant Likely <grant.likely@secretlab.ca> Acked-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- Jul 14, 2010
-
-
Yinghai Lu authored
via following scripts FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/lmb/memblock/g' \ -e 's/LMB/MEMBLOCK/g' \ $FILES for N in $(find . -name lmb.[ch]); do M=$(echo $N | sed 's/lmb/memblock/g') mv $N $M done and remove some wrong change like lmbench and dlmb etc. also move memblock.c from lib/ to mm/ Suggested-by:
Ingo Molnar <mingo@elte.hu> Acked-by:
"H. Peter Anvin" <hpa@zytor.com> Acked-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Yinghai Lu <yinghai@kernel.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Use the MMU config registers to scan for available direct and indirect page sizes and print out the result. Will be needed for future hugetlbfs implementation. Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
We use a similar technique to ppc32: We set a thread local flag to indicate that we are about to enter or have entered the stop state, and have fixup code in the async interrupt entry code that reacts to this flag to make us return to a different location (sets NIP to LINK in our case). Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> -- v2. Fix lockdep bug Re-mask interrupts when coming back from idle
-
- Jul 11, 2010
-
-
Anton Vorontsov authored
This saves runtime memory and fixes lots of sparse warnings like this: CHECK arch/powerpc/sysdev/micropatch.c arch/powerpc/sysdev/micropatch.c:27:6: warning: symbol 'patch_2000' was not declared. Should it be static? arch/powerpc/sysdev/micropatch.c:146:6: warning: symbol 'patch_2f00' was not declared. Should it be static? ... Signed-off-by:
Anton Vorontsov <avorontsov@mvista.com> Signed-off-by:
Kumar Gala <galak@kernel.crashing.org>
-
Anton Vorontsov authored
spi_t was removed in commit 644b2a68 ("powerpc/cpm: Remove SPI defines and spi structs"), the commit assumed that spi_t isn't used anywhere outside of the spi_mpc8xxx driver. But it appears that the struct is needed for micropatch code. So, let's reintroduce the struct. Fixes the following build issue: CC arch/powerpc/sysdev/micropatch.o micropatch.c: In function 'cpm_load_patch': micropatch.c:629: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token micropatch.c:629: error: 'spp' undeclared (first use in this function) micropatch.c:629: error: (Each undeclared identifier is reported only once micropatch.c:629: error: for each function it appears in.) Reported-by:
LEROY Christophe <christophe.leroy@c-s.fr> Reported-by:
Tony Breeds <tony@bakeyournoodle.com> Cc: <stable@kernel.org> [ .33, .34 ] Signed-off-by:
Anton Vorontsov <avorontsov@mvista.com> Signed-off-by:
Kumar Gala <galak@kernel.crashing.org>
-
- Jul 09, 2010
-
-
Michael Ellerman authored
If we are soft disabled and receive a doorbell exception we don't process it immediately. This means we need to check on the way out of irq restore if there are any doorbell exceptions to process. The problem is at that point we don't know what our regs are, and that in turn makes xmon unhappy. To workaround the problem, instead of checking for and processing doorbells, we check for any doorbells and if there were any we send ourselves another. Signed-off-by:
Michael Ellerman <michael@ellerman.id.au> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
The doorbells use the content of the PIR register to match messages from other CPUs. This may or may not be the same as our linux CPU number, so using that as the "target" is no right. Instead, we sample the PIR register at boot on every processor and use that value subsequently when sending IPIs. We also use a per-cpu message mask rather than a global array which should limit cache line contention. Note: We could use the CPU number in the device-tree instead of the PIR register, as they are supposed to be equivalent. This might prove useful if doorbells are to be used to kick CPUs out of FW at boot time, thus before we can sample the PIR. This is however not the case now and using the PIR just works. Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Our handling of debug interrupts on Book3E 64-bit is not quite the way it should be just yet. This is a workaround to let gdb work at least for now. We ensure that when context switching, we set the appropriate DBCR0 value for the new task. We also make sure that we turn off MSR[DE] within the kernel, and set it as part of the bits that get set when going back to userspace. In the long run, we will probably set the userspace DBCR0 on the exception exit code path and ensure we have some proper kernel value to set on the way into the kernel, a bit like ppc32 does, but that will take more work. Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
Form 1 affinity allows multiple entries in ibm,associativity-reference-points which represent affinity domains in decreasing order of importance. The Linux concept of a node is always the first entry, but using the other values as an input to node_distance() allows the memory allocator to make better decisions on which node to go first when local memory has been exhausted. We keep things simple and create an array indexed by NUMA node, capped at 4 entries. Each time we lookup an associativity property we initialise the array which is overkill, but since we should only hit this path during boot it didn't seem worth adding a per node valid bit. Signed-off-by:
Anton Blanchard <anton@samba.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Mark Nelson authored
The RAS code has a #define, RAS_VECTOR_OFFSET, that's used in the check-exception RTAS call for the vector offset of the exception. We'll be using this same vector offset for the upcoming IO Event interrupts code (0x500) so let's move it to include/asm/rtas.h and call it RTAS_VECTOR_EXTERNAL_INTERRUPT. Signed-off-by:
Mark Nelson <markn@au1.ibm.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
Now we dynamically allocate the paca array, it takes an extra load whenever we want to access another cpu's paca. One place we do that a lot is per cpu variables. A simple example: DEFINE_PER_CPU(unsigned long, vara); unsigned long test4(int cpu) { return per_cpu(vara, cpu); } This takes 4 loads, 5 if you include the actual load of the per cpu variable: ld r11,-32760(r30) # load address of paca pointer ld r9,-32768(r30) # load link address of percpu variable sldi r3,r29,9 # get offset into paca (each entry is 512 bytes) ld r0,0(r11) # load paca pointer add r3,r0,r3 # paca + offset ld r11,64(r3) # load paca[cpu].data_offset ldx r3,r9,r11 # load per cpu variable If we remove the ppc64 specific per_cpu_offset(), we get the generic one which indexes into a statically allocated array. This removes one load and one add: ld r11,-32760(r30) # load address of __per_cpu_offset ld r9,-32768(r30) # load link address of percpu variable sldi r3,r29,3 # get offset into __per_cpu_offset (each entry 8 bytes) ldx r11,r11,r3 # load __per_cpu_offset[cpu] ldx r3,r9,r11 # load per cpu variable Having all the offsets in one array also helps when iterating over a per cpu variable across a number of cpus, such as in the scheduler. Before we would need to load one paca cacheline when calculating each per cpu offset. Now we have 16 (128 / sizeof(long)) per cpu offsets in each cacheline. Signed-off-by:
Anton Blanchard <anton@samba.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Denis Kirjanov authored
Fix smatch warning: constant 0x8000000000000000 is so big it is unsigned long Signed-off-by:
Denis Kirjanov <dkirjanov@kernel.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-