Commits · 21af872cd8c695060dd1d045293bf21ea9156a51 · jan.koester / Linux

Dec 11, 2018

Jonathan Marek authored Nov 21, 2018



derived from the a3xx driver and tested on the following hardware:
imx51-zii-rdu1 (a200 with 128kb gmem)
imx53-qsrb (a200)
msm8060-tenderloin (a220)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

21af872c

drm/msm: Optimize adreno_show_object() · 1df4289d

Sharat Masetty authored Nov 01, 2018



When the userspace tries to read the crashstate dump, the read side
implementation in the driver currently ascii85 encodes all the binary
buffers and it does this each time the read system call is called.
A userspace tool like cat typically does a page by page read and the
number of read calls depends on the size of the data captured by the
driver. This is certainly not desirable and does not scale well with
large captures.

This patch encodes the buffer only once in the read path. With this there
is an immediate >10X speed improvement in crashstate save time.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

1df4289d

drm/msm/gpu: Map the ringbuffer in the iova at create time · 84c61275

Jordan Crouse authored Nov 07, 2018



For reasons that I'm sure made perfect sense at the time we were
opting to defer the iova alloc / pin on the ringbuffer until HW
init time so when we moved to iova reference counting we ended
up adding a reference count every time the hardware started.
Not that it mattered (because the ring is always around) but
it did make the debug output look odd. Allocate and pin the iova
at create time instead.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

84c61275

drm/msm: Add a name field for gem objects · 0815d774

Jordan Crouse authored Nov 07, 2018



For debugging purposes it is useful to assign descriptions
to buffers so that we know what they are used for. Add
a field to the buffer object and use that to name the various
kernel side allocations which ends up looking like like this
in /d/dri/X/gem:

   flags       id ref  offset   kaddr            size     madv      name
   00040000: I  0 ( 1) 00000000 0000000070b79eca 00004096           memptrs
      vmas: [gpu: 01000000,mapped,inuse=1]
   00020000: I  0 ( 1) 00000000 0000000031ed4074 00032768           ring0

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

0815d774

drm/msm: Count how many times iova memory is pinned · 7ad0e8cf

Jordan Crouse authored Nov 07, 2018



Add a reference count to track how many times a particular
chunk of iova memory is pinned (mapped) in the iomu and
add msm_gem_unpin_iova to give up references.

It is important to note that msm_gem_unpin_iova replaces
msm_gem_put_iova because the new implicit behavior
that an assigned iova in a given vma is now valid for the
life of the buffer and what we are really focusing on is
the use of that iova.

For now the unmappings are lazy; once the reference counts
go to zero they *COULD* be unmapped dynamically but that
will require an outside force such as a shrinker or
mm_notifiers.  For now, we're just focusing on getting
the counting right and setting ourselves up to be ready
for the future.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

7ad0e8cf

drm/msm: Add msm_gem_get_and_pin_iova() · 9fe041f6

Jordan Crouse authored Nov 07, 2018



Add a new function to get and pin the iova memory in one
step (basically renaming the old msm_gem_get_iova function)
and switch msm_gem_get_iova() to only allocate an iova but
not map it in the IOMMU. This is only currently used by
msm_ioctl_gem_info() since all other users of of the iova
expect that the memory be immediately available.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

9fe041f6

drm/msm: Add a common function to free kernel buffer objects · 1e29dff0

Jordan Crouse authored Nov 07, 2018



Buffer objects allocated with msm_gem_kernel_new() are mostly
freed the same way so we can save a few lines of code with a
common function.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

1e29dff0

drm/msm/a6xx: Track and manage a6xx state memory · d6852b4b

Jordan Crouse authored Nov 02, 2018



The a6xx GPU state allocates a LOT of memory. Add a bit of
infrastructure to track the memory allocations in the GPU structure
and delete them when the state is destroyed much the same way
that devm works with the device model as a whole.  This protects
against the developer accidentally forgetting to add a kfree() to
an ever growing list.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

d6852b4b

drm/msm/a6xx: Add a6xx gpu state · 1707add8

Jordan Crouse authored Nov 02, 2018



Add support for gathering and dumping the a6xx GPU state including
registers, GMU registers, indexed registers, shader blocks,
context clusters and debugbus.

v2: Fix bugs discovered by Sharat Masetty

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

1707add8

drm/msm/adreno: Don't capture register values if target doesn't define them · b9fc2302

Jordan Crouse authored Nov 02, 2018



If the GPU target doesn't define a list of registers then gracefully skip
capturing and/or printing them. This is used by more complex targets like
6xx that have other means of capturing register values.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

b9fc2302

drm/msm/gpu: Move gpu_poll_timeout() to adreno_gpu.h · 983674e2

Jordan Crouse authored Nov 02, 2018



The gpu_poll_timeout() function can be useful to multiple targets so
mvoe it into adreno_gpu.h from the a5xx code.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

983674e2

drm/msm/gpu: Add trace events for tracking GPU submissions · 4241db42

Jordan Crouse authored Nov 02, 2018



Add trace events to track the progress of a GPU submission
msm_gpu_submit occurs at the beginning of the submissions,
msm_gpu_submit_flush happens when the submission is put on
the ringbuffer and msm_submit_flush_retired is sent when
the operation is retired.

To make it easier to track the operations a unique sequence
number is assigned to each submission and displayed in each
event output so a human or a script can easily associate
the events related to a specific submission.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

4241db42

drm/msm/gpu: Add per-submission statistics · 56869210

Jordan Crouse authored Nov 02, 2018



Add infrastructure to track statistics for GPU submissions
by sampling certain perfcounters before and after a submission.

To store the statistics, the per-ring memptrs region is
expanded to include room for up to 64 entries - this should
cover a reasonable amount of inflight submissions without
worrying about losing data. The target specific code inserts
PM4 commands to sample the counters before and after
submission and store them in the data region. The CPU can
access the data after the submission retires to make sense
of the statistics and communicate them to the user.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

56869210

drm/msm: update generated headers · ccdf7e28
Rob Clark authored Dec 02, 2018
```
Signed-off-by: Rob Clark <robdclark@gmail.com>
```
ccdf7e28

drm: msm: Use DRM_DEV_* instead of dev_* · 6a41da17

Mamta Shukla authored Oct 20, 2018



Use DRM_DEV_INFO/ERROR/WARN instead of dev_info/err/debug to generate
drm-formatted specific log messages so that it will be easy to
differentiate in case of multiple instances of driver.

Signed-off-by: Mamta Shukla <mamtashukla555@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>

6a41da17

Oct 23, 2018

drm/msm: fix OF child-node lookup · f9a70823

Johan Hovold authored Aug 27, 2018



Use the new of_get_compatible_child() helper to lookup the legacy
pwrlevels child node instead of using of_find_compatible_node(), which
searches the entire tree from a given start node and thus can return an
unrelated (i.e.  non-child) node.

This also addresses a potential use-after-free (e.g. after probe
deferral) as the tree-wide helper drops a reference to its first
argument (i.e. the probed device's node).

While at it, also fix the related child-node reference leak.

Fixes: e2af8b6b ("drm/msm: gpu: Use OPP tables if we can")
Cc: stable <stable@vger.kernel.org>     # 4.12
Cc: Jordan Crouse <jcrouse@codeaurora.org>
Cc: Rob Clark <robdclark@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>

f9a70823

drm: msm: adreno: Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) +PTR_ERR · c97ea6a6

Mamta Shukla authored Oct 19, 2018

Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR because its
better to have inlined function rather than code-opened implementation.

Signed-off-by: Mamta Shukla <mamtashukla555@gmail.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20181018204815.GA23390@armorer

c97ea6a6

Oct 09, 2018

drm/msm: a6xx: Fix improper u64 division · 16f37102

Sean Paul authored Oct 08, 2018



This patch uses the proper do_div() macro to perform u64 division and
guards against overflow if the result is too large for the unsigned long
return type

Fixes: a2c3c0a5 drm/msm/a6xx: Add devfreq support for a6xx
Cc: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

16f37102

drm/msm: a5xx: Remove unneeded parens · 0f542721

Sean Paul authored Oct 08, 2018

A small fixup I posted with my v2 patch [1] that was dropped.

[1]- https://lists.freedesktop.org/archives/freedreno/2018-October/003647.html



Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

0f542721

Oct 07, 2018

drm/msm/a6xx: Remove CP perfcounter selects from the protected list · 3ce36b45

Jordan Crouse authored Oct 05, 2018



The CP performance counter selects were accidentally marked as protected
so they couldn't be written from PM4 streams. Remove the protection
because user space does have an interest in setting up their own
counters.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

3ce36b45

drm/msm: a5xx: Fix improper u64 division · f926a2e1

Sean Paul authored Oct 04, 2018



This patch uses the proper do_div() macro to perform u64 division and
guards against overflow if the result is too large for the unsigned long
return type

Fixes: de0a3d09 drm/msm: re-factor devfreq code
Cc: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

f926a2e1

drm/msm: update generated headers · a69c5ed2
Rob Clark authored Oct 04, 2018
```
Signed-off-by: Rob Clark <robdclark@gmail.com>
```
a69c5ed2

Oct 04, 2018

drm/msm/a6xx: Add devfreq support for a6xx · a2c3c0a5

Sharat Masetty authored Oct 04, 2018



Implement routines to estimate GPU busy time and fetching the
current frequency for the polling interval. This is required by
the devfreq framework which recommends a frequency change if needed.
The driver code then tries to set this new frequency on the GPU by
sending an Out Of Band(OOB) request to the GMU.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

a2c3c0a5

drm/msm: re-factor devfreq code · de0a3d09

Sharat Masetty authored Oct 04, 2018

The devfreq framework requires the drivers to provide busy time estimations.
The GPU driver relies on the hardware performance counteres for the busy time
estimations, but different hardware revisions have counters which can be
sourced from different clocks. So the busy time estimation will be target
dependent. Additionally on targets where the clocks are completely controlled
by the on chip microcontroller, fetching and setting the current GPU frequency
will be different. This patch aims to embrace these differences by re-factoring
the devfreq code a bit.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

de0a3d09

drm/msm/a6xx: Add gmu_read64() register read op · c28aa203

Sharat Masetty authored Oct 04, 2018



Add a simple function to read 64 registers in the GMU domain

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

c28aa203

drm/msm/a6xx: Add inactive_period for a6xx · bdacdcf2

Jordan Crouse authored Sep 28, 2018



The target definition for a630 didn't set a reasonable
value for inactive_period so it defaulted to zero and
we were essentially powering down after every submission.
Set it back to the default value to keep the GPU from
bouncing too much during regular workloads.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

bdacdcf2

drm/msm: Replace drm_gem_object_{un/reference} with put, get functions · 64686886

Thomas Zimmermann authored Sep 26, 2018



This patch unifies the naming of DRM functions for reference counting
of struct drm_gem_object. The resulting code is more aligned with the
rest of the Linux kernel interfaces.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

64686886

drm/msm/a6xx: Poll for HFI responses · df0dff13

Jordan Crouse authored Sep 20, 2018



The only HFI communication with the GMU on sdm845 happens
during initialization and all commands are synchronous. A fancy
interrupt tasklet and associated infrastructure is entirely
not eeded and puts us at the mercy of the scheduler.

Instead poll for the message signal and handle the response
immediately and go on our way.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

df0dff13

msm/gpu/a6xx: Force of_dma_configure to setup DMA for GMU · 32aa27e1

Jordan Crouse authored Sep 14, 2018



The point of the 'force_dma' parameter for of_dma_configure
is to force the device to be set up even if DMA capability is
not described by the firmware which is exactly the use case
 we have for GMU - we need SMMU to get set up but we have no
other dma capabilities since memory is managed by the GPU
driver. Currently we pass false so of_dma_configure() fails
and subsequently GMU and GPU probe does as well.

Fixes: 4b565ca5 ("drm/msm: Add A6XX device support")
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Tested-by: Sibi Sankar <sibis@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

32aa27e1

drm/msm/a5xx: Skip hardware preemption init if no preemption · fc6510ac

Sharat Masetty authored Sep 14, 2018



In the case where preemption is not enabled, this patch simply skips
preemption related initialization in hardware init sequence.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

fc6510ac

drm/msm/a6xx: Fix PDC register overlap · f8fc924e

Jordan Crouse authored Aug 08, 2018



The current design greedily takes a big chunk of the PDC
register space instead of just the GPU specific sections
which conflicts with other drivers and generally makes
a mess of things.

Furthermore we only need to map the GPU PDC sections
just once during init so map the memory inside the function
that uses it and adjust the pointers and register offsets
accordingly.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

f8fc924e

drm/msm/a6xx: Rename gmu phandle to qcom,gmu · 06feed56

Jordan Crouse authored Aug 08, 2018



>From the review for the DT bindings for the GPU/GMU it
was suggested that the phandle for the GMU be
'qcom,gmu' instead of just 'gmu'.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

06feed56

drm/msm/a6xx: Send the right perf index value to GMU · 9fb4bfd0

Sharat Masetty authored Sep 27, 2018



The index of the perf table was being set in the wrong bit position
in the register. With this fix, the GPU clock can be seen running at
desired frequency.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

9fb4bfd0

Aug 10, 2018

drm/msm: a6xx: fix spelling mistake: "initalization" -> "initialization" · 546907de

Colin Ian King authored Aug 08, 2018



Trivial fix to spelling mistake in dev_err message and comment

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>

546907de

drm/msm: Add A6XX device support · 4b565ca5

Jordan Crouse authored Aug 06, 2018



Add support for the A6XX family of Adreno GPUs. The biggest addition
is the GMU (Graphics Management Unit) which takes over most of the
power management of the GPU itself but in a ironic twist of fate
needs a goodly amount of management itself. Add support for the
A6XX core code, the GMU and the HFI (hardware firmware interface)
queue that the CPU uses to communicate with the GMU.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

4b565ca5

drm/msm: update generated headers · 2d756322

Rob Clark authored Aug 06, 2018



Resync generated headers to pull in a6xx registers.

Signed-off-by: Rob Clark <robdclark@gmail.com>

2d756322

drm/msm/adreno: Load the firmware before bringing up the hardware · 2c087a33

Jordan Crouse authored Aug 06, 2018



Failure to load firmware is the primary reason to fail adreno_load_gpu().
Try to load it first before going into the hardware initialization code and
unwinding it. This is important for a6xx because the GMU gets loaded from
the runtime power code and it is more costly to fail in that path because
of missing firmware.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

2c087a33

Aug 05, 2018

drm/msm/adreno: Remove VLA usage · bec2dd69

Kees Cook authored Jun 29, 2018

In the quest to remove all stack VLA usage from the kernel[1], this
switches to using a kasprintf()ed buffer. Return paths are updated
to free the allocation.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com



Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

bec2dd69

Jul 30, 2018

drm/msm/gpu: avoid deprecated do_gettimeofday · 3530a17f

Arnd Bergmann authored Jul 26, 2018



All users of do_gettimeofday() have been removed, but this one recently
crept in, along with an incorrect printing of the microseconds portion.

This converts it to using ktime_get_real_timespec64() as a direct
replacement, and adds the leading zeroes. I considered using monotonic
times (ktime_get()) instead, but as this timestamp appears to only
be used for humans rather than compared with other timestamps, the
real time domain is probably good enough.

Fixes: e43b045e2c82 ("drm/msm/gpu: Capture the state of the GPU")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Clark <robdclark@gmail.com>

3530a17f

drm/msm/gpu: Add the buffer objects from the submit to the crash dump · cdb95931

Jordan Crouse authored Jul 24, 2018



For hangs, dump copy out the contents of the buffer objects attached to the
guilty submission and print them in the crash dump report.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

cdb95931