- Apr 19, 2019
-
-
Jordan Crouse authored
99.999% of the time during normal operation the GMU is responsible for power and clock control on the GX domain and the CPU remains blissfully unaware. However, there is one situation where the CPU needs to get involved: The power sequencing rules dictate that the GX needs to be turned off before the CX so that the CX can be turned on before the GX during power up. During normal operation when the CPU is taking down the CX domain a stop command is sent to the GMU which turns off the GX domain and then the CPU handles the CX domain. But if the GMU happened to be unresponsive while the GX domain was left then the CPU will need to step in and turn off the GX domain before resetting the CX and rebooting the GMU. This unfortunately means that the CPU needs to be marginally aware of the GX domain even though it is expected to usually keep its hands off. To support this we create a semi-disabled GX power domain that does nothing to the hardware on power up but tries to shut it down normally on power down. In this method the reference counting is correct and we can step in with the pm_runtime_put() at the right time during the failure path. This patch sets up the connection to the GX power domain and does the magic to "enable" and disable it at the right points. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@chromium.org>
-
Jordan Crouse authored
The GMU code currently has some misguided code to try to work around a hardware quirk that requires the power domains on the GPU be collapsed in a certain order. Upcoming patches will do this the right way so get rid of the unused and unwanted regulator code. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@chromium.org>
-
Rob Clark authored
For KHR_robustness, userspace wants to know two things, the count of GPU faults globally, and the count of faults attributed to a given context. This patch providees the former, and the next patch provides the latter. Signed-off-by:
Rob Clark <robdclark@chromium.org> Reviewed-by:
Jordan Crouse <jcrouse@codeaurora.org>
-
Rob Clark authored
For now it always returns '0' (false), but once the iommu work is in place to enable per-process pagetables we can update the value returned. Userspace needs to know this to make an informed decision about exposing KHR_robustness. Signed-off-by:
Rob Clark <robdclark@chromium.org> Reviewed-by:
Jordan Crouse <jcrouse@codeaurora.org>
-
- Apr 18, 2019
-
-
Wen Yang authored
The call to of_get_child_by_name returns a node pointer with refcount incremented thus it must be explicitly decremented after the last usage. Detected by coccinelle with the following warnings: drivers/gpu/drm/msm/adreno/a5xx_gpu.c:57:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 47, but without a corresponding object release within this function. drivers/gpu/drm/msm/adreno/a5xx_gpu.c:66:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 47, but without a corresponding object release within this function. drivers/gpu/drm/msm/adreno/a5xx_gpu.c:118:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 47, but without a corresponding object release within this function. drivers/gpu/drm/msm/adreno/a5xx_gpu.c:57:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 51, but without a corresponding object release within this function. drivers/gpu/drm/msm/adreno/a5xx_gpu.c:66:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 51, but without a corresponding object release within this function. drivers/gpu/drm/msm/adreno/a5xx_gpu.c:118:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 51, but without a corresponding object release within this function. Signed-off-by:
Wen Yang <wen.yang99@zte.com.cn> Cc: Rob Clark <robdclark@gmail.com> Cc: Sean Paul <sean@poorly.run> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jordan Crouse <jcrouse@codeaurora.org> Cc: Mamta Shukla <mamtashukla555@gmail.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Sharat Masetty <smasetty@codeaurora.org> Cc: linux-arm-msm@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: freedreno@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org (open list) Reviewed-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com> Signed-off-by:
Rob Clark <robdclark@chromium.org>
-
Douglas Anderson authored
The patch ("OPP: Add support for parsing the 'opp-level' property") adds an API enabling a cleaner way to read the opp-level. Let's use the new API. Signed-off-by:
Douglas Anderson <dianders@chromium.org> Reviewed-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com> Signed-off-by:
Rob Clark <robdclark@chromium.org>
-
- Feb 19, 2019
-
-
Jordan Crouse authored
Try to get the interconnect path for the GPU and vote for the maximum bandwidth to support all frequencies. This is needed for performance. Later we will want to scale the bandwidth based on the frequency to also optimize for power but that will require some device tree infrastructure that does not yet exist. v6: use icc_set_bw() instead of icc_set() v5: Remove hardcoded interconnect name and just use the default v4: Don't use a port string at all to skip the need for names in the DT v3: Use macros and change port string per Georgi Djakov Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Acked-by:
Rob Clark <robdclark@gmail.com> Reviewed-by:
Evan Green <evgreen@chromium.org> Signed-off-by:
Georgi Djakov <georgi.djakov@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Jan 29, 2019
-
-
Douglas Anderson authored
The bindings for Qualcomm opp levels changed after being Acked but before landing. Thus the code in the GPU driver that was relying on the old bindings is now broken. Let's change the code to match the new bindings by adjusting the old string 'qcom,level' to the new string 'opp-level'. See the patch ("dt-bindings: opp: Introduce opp-level bindings"). NOTE: we will do additional cleanup to totally remove the string from the code and use the new dev_pm_opp_get_level() but we'll do it in a future patch. This will facilitate getting the important code fix in sooner without having to deal with cross-maintainer dependencies. This patch needs to land before the patch ("arm64: dts: sdm845: Add gpu and gmu device nodes") since if a tree contains the device tree patch but not this one you'll get a crash at bootup. Fixes: 4b565ca5 ("drm/msm: Add A6XX device support") Signed-off-by:
Douglas Anderson <dianders@chromium.org> Reviewed-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
Every GPU core only has one interrupt so there isn't any value in looking up the interrupt by name. Remove the name (which is legacy anyway) and use platform_get_irq() instead. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Reviewed-by:
Douglas Anderson <dianders@chromium.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- Jan 24, 2019
-
-
Douglas Anderson authored
The bindings for Qualcomm opp levels changed after being Acked but before landing. Thus the code in the GPU driver that was relying on the old bindings is now broken. Let's change the code to match the new bindings by adjusting the old string 'qcom,level' to the new string 'opp-level'. See the patch ("dt-bindings: opp: Introduce opp-level bindings"). NOTE: we will do additional cleanup to totally remove the string from the code and use the new dev_pm_opp_get_level() but we'll do it in a future patch. This will facilitate getting the important code fix in sooner without having to deal with cross-maintainer dependencies. This patch needs to land before the patch ("arm64: dts: sdm845: Add gpu and gmu device nodes") since if a tree contains the device tree patch but not this one you'll get a crash at bootup. Fixes: 4b565ca5 ("drm/msm: Add A6XX device support") Signed-off-by:
Douglas Anderson <dianders@chromium.org> Reviewed-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
Every GPU core only has one interrupt so there isn't any value in looking up the interrupt by name. Remove the name (which is legacy anyway) and use platform_get_irq() instead. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Reviewed-by:
Douglas Anderson <dianders@chromium.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- Dec 11, 2018
-
-
Jonathan Marek authored
This patch allows using drm/msm without qcom display hardware. It adds a amd,imageon compatible, which is used instead of qcom,adreno, but does not require a top level msm node. Signed-off-by:
Jonathan Marek <jonathan@marek.ca> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jonathan Marek authored
A2XX has its own very simple MMU. Added a msm_use_mmu() function because we can't rely on iommu_present to decide to use MMU or not. Signed-off-by:
Jonathan Marek <jonathan@marek.ca> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
Add a buffer object name for the a6xx crashdumper so it can be seen with the changes introduced by 7799a98edd ("drm/msm: Add a name field for gem objects"). Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
dadb36b7ec42 ("drm/msm: Add a common function to free kernel buffer objects") missed freeing the crashdumper state for a6xx. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jonathan Marek authored
derived from the a3xx driver and tested on the following hardware: imx51-zii-rdu1 (a200 with 128kb gmem) imx53-qsrb (a200) msm8060-tenderloin (a220) Signed-off-by:
Jonathan Marek <jonathan@marek.ca> Reviewed-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Sharat Masetty authored
When the userspace tries to read the crashstate dump, the read side implementation in the driver currently ascii85 encodes all the binary buffers and it does this each time the read system call is called. A userspace tool like cat typically does a page by page read and the number of read calls depends on the size of the data captured by the driver. This is certainly not desirable and does not scale well with large captures. This patch encodes the buffer only once in the read path. With this there is an immediate >10X speed improvement in crashstate save time. Signed-off-by:
Sharat Masetty <smasetty@codeaurora.org> Reviewed-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
For reasons that I'm sure made perfect sense at the time we were opting to defer the iova alloc / pin on the ringbuffer until HW init time so when we moved to iova reference counting we ended up adding a reference count every time the hardware started. Not that it mattered (because the ring is always around) but it did make the debug output look odd. Allocate and pin the iova at create time instead. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
For debugging purposes it is useful to assign descriptions to buffers so that we know what they are used for. Add a field to the buffer object and use that to name the various kernel side allocations which ends up looking like like this in /d/dri/X/gem: flags id ref offset kaddr size madv name 00040000: I 0 ( 1) 00000000 0000000070b79eca 00004096 memptrs vmas: [gpu: 01000000,mapped,inuse=1] 00020000: I 0 ( 1) 00000000 0000000031ed4074 00032768 ring0 Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
Add a reference count to track how many times a particular chunk of iova memory is pinned (mapped) in the iomu and add msm_gem_unpin_iova to give up references. It is important to note that msm_gem_unpin_iova replaces msm_gem_put_iova because the new implicit behavior that an assigned iova in a given vma is now valid for the life of the buffer and what we are really focusing on is the use of that iova. For now the unmappings are lazy; once the reference counts go to zero they *COULD* be unmapped dynamically but that will require an outside force such as a shrinker or mm_notifiers. For now, we're just focusing on getting the counting right and setting ourselves up to be ready for the future. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
Add a new function to get and pin the iova memory in one step (basically renaming the old msm_gem_get_iova function) and switch msm_gem_get_iova() to only allocate an iova but not map it in the IOMMU. This is only currently used by msm_ioctl_gem_info() since all other users of of the iova expect that the memory be immediately available. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
Buffer objects allocated with msm_gem_kernel_new() are mostly freed the same way so we can save a few lines of code with a common function. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
The a6xx GPU state allocates a LOT of memory. Add a bit of infrastructure to track the memory allocations in the GPU structure and delete them when the state is destroyed much the same way that devm works with the device model as a whole. This protects against the developer accidentally forgetting to add a kfree() to an ever growing list. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
Add support for gathering and dumping the a6xx GPU state including registers, GMU registers, indexed registers, shader blocks, context clusters and debugbus. v2: Fix bugs discovered by Sharat Masetty Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
If the GPU target doesn't define a list of registers then gracefully skip capturing and/or printing them. This is used by more complex targets like 6xx that have other means of capturing register values. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
The gpu_poll_timeout() function can be useful to multiple targets so mvoe it into adreno_gpu.h from the a5xx code. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
Add trace events to track the progress of a GPU submission msm_gpu_submit occurs at the beginning of the submissions, msm_gpu_submit_flush happens when the submission is put on the ringbuffer and msm_submit_flush_retired is sent when the operation is retired. To make it easier to track the operations a unique sequence number is assigned to each submission and displayed in each event output so a human or a script can easily associate the events related to a specific submission. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Jordan Crouse authored
Add infrastructure to track statistics for GPU submissions by sampling certain perfcounters before and after a submission. To store the statistics, the per-ring memptrs region is expanded to include room for up to 64 entries - this should cover a reasonable amount of inflight submissions without worrying about losing data. The target specific code inserts PM4 commands to sample the counters before and after submission and store them in the data region. The CPU can access the data after the submission retires to make sense of the statistics and communicate them to the user. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Rob Clark authored
Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Mamta Shukla authored
Use DRM_DEV_INFO/ERROR/WARN instead of dev_info/err/debug to generate drm-formatted specific log messages so that it will be easy to differentiate in case of multiple instances of driver. Signed-off-by:
Mamta Shukla <mamtashukla555@gmail.com> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- Nov 29, 2018
-
-
Stephen Boyd authored
We need to check the call to cmd_db_read_aux_data() for the error case, so that we don't continue and use potentially uninitialized values for 'pri_count' and 'sec_count'. Otherwise, we get the following compiler warnings: drivers/gpu/drm/msm/adreno/a6xx_gmu.c: In function 'a6xx_gmu_rpmh_arc_votes_init.isra.12': drivers/gpu/drm/msm/adreno/a6xx_gmu.c:943:12: warning: 'pri_count' is used uninitialized in this function [-Wuninitialized] pri_count >>= 1; ^~~ drivers/gpu/drm/msm/adreno/a6xx_gmu.c:948:12: warning: 'sec_count' may be used uninitialized in this function [-Wmaybe-uninitialized] sec_count >>= 1; ^~~ Reported-by:
Stephen Rothwell <sfr@canb.auug.org.au> Reported-by:
kbuild test robot <lkp@intel.com> Cc: Jordan Crouse <jcrouse@codeaurora.org> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Evan Green <evgreen@chromium.org> Cc: Jordan Crouse <jcrouse@codeaurora.org> Cc: Rob Clark <robdclark@gmail.com> Fixes: ed3cafa7 ("soc: qcom: cmd-db: Stop memcpy()ing in cmd_db_read_aux_data()") Signed-off-by:
Stephen Boyd <swboyd@chromium.org> Reviewed-by:
Andy Gross <andy.gross@linaro.org> Acked-by:
Rob Clark <robdclark@gmail.com> Signed-off-by:
Andy Gross <andy.gross@linaro.org>
-
- Nov 14, 2018
-
-
Stephen Boyd authored
Let's change the function signature to return the pointer to memory or an error pointer on failure, and take an argument that lets us return the size of the aux data read. This way we can remove the cmd_db_read_aux_data_len() API entirely and also get rid of the memcpy operation from cmd_db to the caller. Updating the only user of this code shows that making this change allows us to remove a function and put the lookup where the user is. Cc: Mahesh Sivasubramanian <msivasub@codeaurora.org> Cc: Lina Iyer <ilina@codeaurora.org> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Evan Green <evgreen@chromium.org> Cc: Jordan Crouse <jcrouse@codeaurora.org> Cc: Rob Clark <robdclark@gmail.com> Signed-off-by:
Stephen Boyd <swboyd@chromium.org> Signed-off-by:
Andy Gross <andy.gross@linaro.org>
-
- Oct 23, 2018
-
-
Johan Hovold authored
Use the new of_get_compatible_child() helper to lookup the legacy pwrlevels child node instead of using of_find_compatible_node(), which searches the entire tree from a given start node and thus can return an unrelated (i.e. non-child) node. This also addresses a potential use-after-free (e.g. after probe deferral) as the tree-wide helper drops a reference to its first argument (i.e. the probed device's node). While at it, also fix the related child-node reference leak. Fixes: e2af8b6b ("drm/msm: gpu: Use OPP tables if we can") Cc: stable <stable@vger.kernel.org> # 4.12 Cc: Jordan Crouse <jcrouse@codeaurora.org> Cc: Rob Clark <robdclark@gmail.com> Cc: David Airlie <airlied@linux.ie> Signed-off-by:
Johan Hovold <johan@kernel.org> Signed-off-by:
Rob Herring <robh@kernel.org>
-
Mamta Shukla authored
Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR because its better to have inlined function rather than code-opened implementation. Signed-off-by:
Mamta Shukla <mamtashukla555@gmail.com> Reviewed-by:
Sean Paul <sean@poorly.run> Signed-off-by:
Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20181018204815.GA23390@armorer
-
- Oct 09, 2018
-
-
Sean Paul authored
This patch uses the proper do_div() macro to perform u64 division and guards against overflow if the result is too large for the unsigned long return type Fixes: a2c3c0a5 drm/msm/a6xx: Add devfreq support for a6xx Cc: Sharat Masetty <smasetty@codeaurora.org> Signed-off-by:
Sean Paul <seanpaul@chromium.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Sean Paul authored
A small fixup I posted with my v2 patch [1] that was dropped. [1]- https://lists.freedesktop.org/archives/freedreno/2018-October/003647.html Signed-off-by:
Sean Paul <seanpaul@chromium.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- Oct 07, 2018
-
-
Jordan Crouse authored
The CP performance counter selects were accidentally marked as protected so they couldn't be written from PM4 streams. Remove the protection because user space does have an interest in setting up their own counters. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Sean Paul authored
This patch uses the proper do_div() macro to perform u64 division and guards against overflow if the result is too large for the unsigned long return type Fixes: de0a3d09 drm/msm: re-factor devfreq code Cc: Sharat Masetty <smasetty@codeaurora.org> Signed-off-by:
Sean Paul <seanpaul@chromium.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Rob Clark authored
Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- Oct 04, 2018
-
-
Sharat Masetty authored
Implement routines to estimate GPU busy time and fetching the current frequency for the polling interval. This is required by the devfreq framework which recommends a frequency change if needed. The driver code then tries to set this new frequency on the GPU by sending an Out Of Band(OOB) request to the GMU. Signed-off-by:
Sharat Masetty <smasetty@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@gmail.com>
-