- Sep 05, 2008
-
-
Jeremy Kerr authored
We currently have a race for a free SPE. With one thread doing a spu_yield(), and another doing a spu_activate(): thread 1 thread 2 spu_yield(oldctx) spu_activate(ctx) __spu_deactivate(oldctx) spu_unschedule(oldctx, spu) spu->alloc_state = SPU_FREE spu = spu_get_idle(ctx) - searches for a SPE in state SPU_FREE, gets the context just freed by thread 1 spu_schedule(ctx, spu) spu->alloc_state = SPU_USED spu_schedule(newctx, spu) - assumes spu is still free - tries to schedule context on already-used spu This change introduces a 'free_spu' flag to spu_unschedule, to indicate whether or not the function should free the spu after descheduling the context. We only set this flag if we're not going to re-schedule another context on this SPU. Add a comment to document this behaviour. Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
Jeremy Kerr authored
Commit 8d5636fb introduced a reference count on SPU contexts during find_victim, but this may cause a leak in the reference count if we later find a better contender for a context to unschedule. Change the reference to after we've found our victim context, so we don't do the extra get_spu_context(). Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
- Aug 19, 2008
-
-
Ilpo Järvinen authored
Signed-off-by:
Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
- Aug 14, 2008
-
-
Jeremy Kerr authored
Based on an original patch from Christoph Hellwig <hch@lst.de>. Currently, there is a possible reference-after-free in the spusched code - contexts may be freed after we have released their state_mutex in spusched_tick and find_victim. This change takes a reference to the context before releasing the mutex, so that the context doesn't get destroyed. Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
- Aug 13, 2008
-
-
Jeremy Kerr authored
Currently, spu_run ignores the npc argument for contexts created with SPU_CREATE_NOSCHED. While this is correct for isolated contexts, there's no need to enforce the npc restriction on non-isolated NOSCHED contexts. This means that NOSCHED contexts can only ever run with an entry point of 0x0. This change to spu_run_init allows setting of the npc (and, while we're at it, the privcntl) for non-isolated NOSCHED contexts. This allows us to run NOSCHED contexts from any entry point. Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
- Jul 30, 2008
-
-
Thomas Renninger authored
Ingo Molnar provided a fix to not call _PPC at processor driver initialization time in "[PATCH] ACPI: fix cpufreq regression" (git commit e4233dec) But it can still happen that _PPC is called at processor driver initialization time. This patch should make sure that this is not possible anymore. Signed-off-by:
Thomas Renninger <trenn@suse.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Len Brown <lenb@kernel.org> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: <stable@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Jul 26, 2008
-
-
Alexey Dobriyan authored
Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by:
Alexey Dobriyan <adobriyan@gmail.com> Acked-by:
Pekka Enberg <penberg@cs.helsinki.fi> Acked-by:
Christoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
FUJITA Tomonori authored
Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER architecture does: This enables us to cleanly fix the Calgary IOMMU issue that some devices are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423 ). I think that per-device dma_mapping_ops support would be also helpful for KVM people to support PCI passthrough but Andi thinks that this makes it difficult to support the PCI passthrough (see the above thread). So I CC'ed this to KVM camp. Comments are appreciated. A pointer to dma_mapping_ops to struct dev_archdata is added. If the pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's NULL, the system-wide dma_ops pointer is used as before. If it's useful for KVM people, I plan to implement a mechanism to register a hook called when a new pci (or dma capable) device is created (it works with hot plugging). It enables IOMMUs to set up an appropriate dma_mapping_ops per device. The major obstacle is that dma_mapping_error doesn't take a pointer to the device unlike other DMA operations. So x86 can't have dma_mapping_ops per device. Note all the POWER IOMMUs use the same dma_mapping_error function so this is not a problem for POWER but x86 IOMMUs use different dma_mapping_error functions. The first patch adds the device argument to dma_mapping_error. The patch is trivial but large since it touches lots of drivers and dma-mapping.h in all the architecture. This patch: dma_mapping_error() doesn't take a pointer to the device unlike other DMA operations. So we can't have dma_mapping_ops per device. Note that POWER already has dma_mapping_ops per device but all the POWER IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device argument. [akpm@linux-foundation.org: fix sge] [akpm@linux-foundation.org: fix svc_rdma] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix bnx2x] [akpm@linux-foundation.org: fix s2io] [akpm@linux-foundation.org: fix pasemi_mac] [akpm@linux-foundation.org: fix sdhci] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix sparc] [akpm@linux-foundation.org: fix ibmvscsi] Signed-off-by:
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Jul 25, 2008
-
-
Robert Jennings authored
To support Cooperative Memory Overcommitment (CMO), we need to check for failure from some of the tce hcalls. These changes for the pseries platform affect the powerpc architecture; patches for the other affected platforms are included in this patch. pSeries platform IOMMU code changes: * platform TCE functions must handle H_NOT_ENOUGH_RESOURCES errors and return an error. Architecture IOMMU code changes: * Calls to ppc_md.tce_build need to check return values and return DMA_MAPPING_ERROR for transient errors. Architecture changes: * struct machdep_calls for tce_build*_pSeriesLP functions need to change to indicate failure. * all other platforms will need updates to iommu functions to match the new calling semantics; they will return 0 on success. The other platforms default configs have been built, but no further testing was performed. Signed-off-by:
Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by:
Olof Johansson <olof@lixom.net> Acked-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Mark Nelson authored
At the moment the fixed mapping is by default strongly ordered (the iommu_fixed=weak boot option must be used to make the fixed mapping weakly ordered). If we're on a setup where the southbridge is being used in endpoint mode (triblade and CAB boards) the default should be a weakly ordered fixed mapping. This adds a check so that if a node of type pcie-endpoint can be found in the device tree the fixed mapping is set to be weak by default (but can be overridden using iommu_fixed=strong). Signed-off-by:
Mark Nelson <markn@au1.ibm.com> Acked-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- Jul 24, 2008
-
-
Benjamin Herrenschmidt authored
This uses the new vm_ops->access to allow gdb to access the SPU local store. We currently prevent access to problem state registers, this can be done later if really needed but it's safer not to. [akpm@linux-foundation.org: fix typo] Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Rik van Riel <riel@redhat.com> Cc: Dave Airlie <airlied@linux.ie> Cc: Hugh Dickins <hugh@veritas.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Andre Detsch authored
This patch adjusts the placement of a reference context from a spu affinity chain. The reference context can now be placed only on nodes that have enough spus not intended to be used by another gang (already running on the node). Signed-off-by:
Andre Detsch <adetsch@br.ibm.com> Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
Andre Detsch authored
Currenlt,, it is possible to lock aff_mutex and cbe_spu_info[n].list_mutex in different orders, allowing a deadlock to occur. With this change, aff_mutex is not taken within a list_mutex critical section anymore. Signed-off-by:
Andre Detsch <adetsch@br.ibm.com> Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
- Jul 22, 2008
-
-
Milton Miller authored
kcalloc is supposed to be called with the count as its first argument and the element size as the second. Signed-off-by:
Milton Miller <miltonm@bga.com> Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
Andi Kleen authored
This allow to dynamically generate attributes and share show/store functions between attributes. Right now most attributes are generated by special macros and lots of duplicated code. With the attribute passed it's instead possible to attach some data to the attribute and then use that in shared low level functions to do different things. I need this for the dynamically generated bank attributes in the x86 machine check code, but it'll allow some further cleanups. I converted all users in tree to the new show/store prototype. It's a single huge patch to avoid unbisectable sections. Runtime tested: x86-32, x86-64 Compiled only: ia64, powerpc Not compile tested/only grep converted: sh, arm, avr32 Signed-off-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Mark Nelson authored
Introduce a new dma attriblue DMA_ATTR_WEAK_ORDERING to use weak ordering on DMA mappings in the Cell processor. Add the code to the Cell's IOMMU implementation to use this code. Dynamic mappings can be weakly or strongly ordered on an individual basis but the fixed mapping has to be either completely strong or completely weak. This is currently decided by a kernel boot option (pass iommu_fixed=weak for a weakly ordered fixed linear mapping, strongly ordered is the default). Signed-off-by:
Mark Nelson <markn@au1.ibm.com> Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Mark Nelson authored
Update iommu_alloc() to take the struct dma_attrs and pass them on to tce_build(). This change propagates down to the tce_build functions of all the platforms. Signed-off-by:
Mark Nelson <markn@au1.ibm.com> Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Christian Krafft authored
This patch adds support for the power button on future IBM cell blades. It actually doesn't shut down the machine. Instead it exposes an input device /dev/input/event0 to userspace which sends KEY_POWER if power button has been pressed. haldaemon actually recognizes the button, so a plattform independent acpid replacement should handle it correctly. Signed-off-by:
Christian Krafft <krafft@de.ibm.com> Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Christian Krafft authored
This patch adds a config option for the sysreset_hack used for IBM Cell blades. The code is moves from pervasive.c into ras.c and gets it's own init method. Signed-off-by:
Christian Krafft <krafft@de.ibm.com> Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Christian Krafft authored
This patch adds a cpufreq governor that takes the number of running spus into account. It's very similar to the ondemand governor, but not as complex. Instead of hacking spu load into the ondemand governor it might be easier to have cpufreq accepting multiple governors per cpu in future. Don't know if this is the right way, but it would keep the governors simple. Signed-off-by:
Christian Krafft <krafft@de.ibm.com> Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Acked-by:
Dave Jones <davej@redhat.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- Jul 15, 2008
-
-
Dave Kleikamp authored
It is okay for both _PAGE_GUARDED and _PAGE_COHERENT (G and M) to be set in the same pte. In fact, even if that were not the case, there doesn't seem to be any place where G is set without also setting I (_PAGE_NO_CACHE), so the test for I is sufficient as a condition to clear _PAGE_COHERENT when filling the hash table. Signed-off-by:
Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- Jul 09, 2008
-
-
Mark Nelson authored
Make cell_dma_dev_setup_iommu() return a pointer to the struct iommu_table (or NULL if no table can be found) rather than putting this pointer into dev->archdata.dma_data (let the caller do that), and rename this function to cell_get_iommu_table() to reflect this change. This will allow us to get the iommu table for a device that doesn't have the table in the archdata. Signed-off-by:
Mark Nelson <markn@au1.ibm.com> Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Maxim Shchetynin authored
As nr_active counter includes also spus waiting for syscalls to return we need a seperate counter that only counts spus that are currently running on spu side. This counter shall be used by a cpufreq governor that targets a frequency dependent from the number of running spus. Signed-off-by:
Christian Krafft <krafft@de.ibm.com> Acked-by:
Jeremy Kerr <jk@ozlabs.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Jeremy Kerr authored
Currently, the .ctx debug file in spu context directories is always present. We'd prefer to prevent users from relying on this file, so add a "debug" mount option to spufs. The .ctx file will only be added to the context directories when this option is present. Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
Jeremy Kerr authored
Populate the size member of a few context files. Leave out files that have different semantics with read vs mmap, or contain a variable-length hex string. Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
Jeremy Kerr authored
Currently, spufs never specifies the i_size for the files in context directories, so stat() always reports 0-byte files. This change adds allows the spufs_dir_(nosched_)contents arrays to specify a file size. This allows stat() to report correct file sizes, and makes SEEK_END work. Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
Jeremy Kerr authored
Use a set of #defines for the size of context mappings, instead of magic numbers. Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
Luke Browning authored
An spu context shouldn't get an extra tick if the time slice code couldn't find something else to run. This means contexts that are not within spu_run (ie, SPU_SCHED_SPU_RUN is cleared) will not receive extra ticks while we have no other contexts waiting. Signed-off-by:
Luke Browning <lukebrowning@us.ibm.com> Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
Luke Browning authored
Add a ctxt file to spufs that shows spu context information that is used in scheduling. This info can be used for debugging spufs scheduler issues, and to isolate between application and spufs problems as it shows a lot of state such as priorities and dispatch counts. This file contains internal spufs state and is subject to change at any time, and therefore no applications should depend on it. The file is intended for the use of spufs kernel developers. Signed-off-by:
Luke Browning <lukebrowning@us.ibm.com> Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
- Jun 30, 2008
-
-
Arnd Bergmann authored
We need to disable ptcal before starting a new kernel after a crash, in order to avoid overwriting data in the kdump kernel. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Acked-by:
Jeremy Kerr <jk@ozlabs.org> Signed-off-by:
Paul Mackerras <paulus@samba.org>
-
Nicholas Piggin authored
Signed-off-by:
Nick Piggin <npiggin@suse.de> Acked-by:
Jeremy Kerr <jk@ozlabs.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Acked-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Paul Mackerras <paulus@samba.org>
-
- Jun 26, 2008
-
-
Jens Axboe authored
This converts ppc to use the new helpers for smp_call_function() and friends, and adds support for smp_call_function_single(). ppc loses the timeout functionality of smp_call_function_mask() with this change, as the generic code does not provide that. Acked-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- Jun 16, 2008
-
-
Luke Browning authored
There is a delay in the transition to the stopped state for class 2 interrupts. In some cases, the controlling thread detects the state of the spu as running, and goes back to sleep resulting in a hung application as the event is missed. This change detects the stop condition and re-generates the wakeup event after a context save. Signed-off-by:
Luke Browning <lukebrowning@us.ibm.com> Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
Luke Browning authored
Time slicing can occur at the same time as spu exception handling resulting in the wakeup of the wrong thread. This change uses the the spu's register_lock to enforce synchronization between bind/unbind and spu exception handling so that they are mutually exclusive. Signed-off-by:
Luke Browning <lukebrowning@us.ibm.com> Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
Luke Browning authored
According to the CBEA, the SPU dsisr is not updated for class 0 exceptions. spu_stopped() is testing the dsisr that was passed to it from the class 0 exception handler, so we return a false positive here. This patch cleans up the interrupt handler and erroneous tests in spu_stopped. It also removes the fields from the csa since it is not needed to process class 0 events. Signed-off-by:
Luke Browning <lukebrowning@us.ibm.com> Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
Luke Browning authored
If the spu is stopping (ie, the SPU_STATUS_RUNNING bit is still set), re-read the register to get the final stopped value. Signed-off-by:
Luke Browning <lukebrowning@us.ibm.com> Signed-off-by:
Jeremy Kerr <jk@ozlabs.org>
-
- Jun 09, 2008
-
-
Michael Ellerman authored
When I changed irq_alloc_host() to take an of_node (52964f87: "Add an optional device_node pointer to the irq_host"), I botched the reference counting semantics. Stephen pointed out that it's irq_alloc_host()'s business if it needs to take an additional reference to the device_node, the caller shouldn't need to care. Signed-off-by:
Michael Ellerman <michael@ellerman.id.au> Signed-off-by:
Paul Mackerras <paulus@samba.org>
-
Michael Ellerman authored
If we do the call to irq_of_parse_and_map() first, then we don't need to worry about freeing the irq_host. Signed-off-by:
Michael Ellerman <michael@ellerman.id.au> Signed-off-by:
Paul Mackerras <paulus@samba.org>
-
- Jun 04, 2008
-
-
Al Viro authored
Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- May 23, 2008
-
-
Michael Ellerman authored
This adds some debugging code to the Axon MSI driver. It creates a file per MSIC in /sys/kernel/debug/powerpc, which allows the user to trigger a fake MSI interrupt by writing to the file. This can be used to test some of the MSI generation path. In particular, that the MSIC recognises a write to the MSI address, generates an interrupt and writes the MSI packet into the ring buffer. All the code is inside #ifdef DEBUG so it causes no harm unless it's enabled. Signed-off-by:
Michael Ellerman <michael@ellerman.id.au> Signed-off-by:
Paul Mackerras <paulus@samba.org>
-