Max render time shows how early the frame clock needs to be dispatched
to make it to the predicted next presentation time. Before this commit
it was set to the refresh interval minus 2 ms. This means Mutter would
always start compositing 14.7 ms before a display refresh on a 60 Hz
screen or 4.9 ms before a display refresh on a 144 Hz screen. However,
Mutter frequently does not need as much time to finish compositing and
submit the buffer to KMS:

      max render time
      /------------\
---|---------------|---------------|---> presentations
      D----S        D--S

D - frame clock dispatch
S - buffer submission

This commit aims to automatically compute a shorter max render time to
make Mutter start compositing as late as possible (but still making it
in time for the presentation):

          max render time
             /-----\
---|---------------|---------------|---> presentations
             D----S            D--S

Why is this better? First of all, Mutter samples the application
contents to draw at the time compositing starts. If a new application
buffer arrives after the compositing has started, but before the next
presentation, it won't make it on screen:

---|---------------|---------------|---> presentations
      D----S        D--S
       A-------------X----------->
                   ^ doesn't make it for this presentation

A - application buffer commit
X - application buffer sampled by Mutter

Here the application committed just a few ms too late and didn't make it on
screen until the next presentation. If compositing starts later in the
frame cycle, applications can commit buffers closer to the presentation.
These buffers will be more up-to-date thereby reducing input latency.

---|---------------|---------------|---> presentations
             D----S            D--S
        A----X---->
                   ^ made it!

Moreover, applications are recommended to render their frames on frame
callbacks, which Mutter sends right after compositing is done. Since
this commit delays the compositing, it also reduces the latency for
applications drawing on frame callbacks. Compare:

---|---------------|---------------|---> presentations
      D----S        D--S
            F--A-------X----------->
            \____________________/
                   latency

---|---------------|---------------|---> presentations
             D----S            D--S
                   F--A-------X---->
                   \_____________/
                      less latency

F - frame callback received, application starts rendering

So how do we actually estimate max render time? We want it to be as low
as possible, but still large enough so as not to miss any frames by
accident:

          max render time
             /-----\
---|---------------|---------------|---> presentations
             D------S------------->
               oops, took a little too long

For a successful presentation, the frame needs to be submitted to KMS
and the GPU work must be completed before the vblank. This deadline can
be computed by subtracting the vblank duration (calculated from display
mode) from the predicted next presentation time.
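
As a rough sketch of that computation (illustrative names, not the
actual Mutter code; the mode timings are the ones DRM exposes):

  #include <stdint.h>

  /* Hypothetical mode description; the real values come from the
   * drmModeModeInfo of the current mode. */
  typedef struct
  {
    uint32_t clock_khz; /* pixel clock in kHz */
    uint16_t htotal;    /* total pixels per line, including blanking */
    uint16_t vtotal;    /* total lines per frame, including blanking */
    uint16_t vdisplay;  /* visible lines per frame */
  } ModeInfo;

  /* Vertical blanking duration in nanoseconds:
   * blanking lines * pixels per line / pixels per second. */
  static int64_t
  vblank_duration_ns (const ModeInfo *mode)
  {
    int64_t blanking_lines = mode->vtotal - mode->vdisplay;

    return blanking_lines * mode->htotal * INT64_C (1000000) / mode->clock_khz;
  }

  /* Submission to KMS and GPU completion must both happen before this. */
  static int64_t
  deadline_ns (const ModeInfo *mode,
               int64_t         next_presentation_time_ns)
  {
    return next_presentation_time_ns - vblank_duration_ns (mode);
  }
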
We don't know how long compositing will take, and we also don't know how
long the GPU work will take, since clients can submit buffers with
unfinished GPU work. So we measure and estimate these values.
The frame clock dispatch can be split into two phases:
1. From start of the dispatch to all GPU commands being submitted (but
not finished)—until the call to eglSwapBuffers().
2. From eglSwapBuffers() to submitting the buffer to KMS and to GPU
work completing. These happen in parallel, and we want the latest of
the two to be done before the vblank.
We measure these three durations and store them for the last 16 frames.
The estimate for each duration is the maximum of these last 16 durations.
Usually even taking just the last frame's durations as the estimates
works well enough, but I found that screen-capturing with OBS Studio
increases duration variability enough to cause frequent missed frames
when using that method. Taking the maximum of the last 16 frames
smooths out this variability.
The durations are naturally quite variable and the estimates aren't
perfect. To take this into account, an additional constant 2 ms is added
to the max render time.
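
Putting the pieces together, a simplified sketch of the estimate
(microsecond durations, illustrative names; the actual Mutter code is
structured differently):

  #include <stdint.h>

  #define NUM_SAMPLES 16

  /* 2 ms of constant headroom on top of the estimate. */
  #define HEADROOM_US 2000

  typedef struct
  {
    /* Last NUM_SAMPLES measurements of each duration:
     * dispatch start -> eglSwapBuffers(), eglSwapBuffers() -> KMS
     * submission, and eglSwapBuffers() -> GPU work done. */
    int64_t dispatch_to_swap_us[NUM_SAMPLES];
    int64_t swap_to_flip_us[NUM_SAMPLES];
    int64_t swap_to_gpu_done_us[NUM_SAMPLES];
  } FrameStats;

  static int64_t
  max_of (const int64_t *samples, int n)
  {
    int64_t max = samples[0];
    for (int i = 1; i < n; i++)
      if (samples[i] > max)
        max = samples[i];
    return max;
  }

  static int64_t
  max2 (int64_t a, int64_t b)
  {
    return a > b ? a : b;
  }

  /* Each estimate is the maximum over the last 16 frames; phase 2 is
   * bounded by the later of KMS submission and GPU completion. */
  static int64_t
  estimate_max_render_time_us (const FrameStats *stats)
  {
    int64_t phase_1 = max_of (stats->dispatch_to_swap_us, NUM_SAMPLES);
    int64_t phase_2 = max2 (max_of (stats->swap_to_flip_us, NUM_SAMPLES),
                            max_of (stats->swap_to_gpu_done_us, NUM_SAMPLES));

    return phase_1 + phase_2 + HEADROOM_US;
  }
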
How does it perform in practice? On my desktop with 144 Hz monitors I
get a max render time of 4–5 ms instead of the default 4.9 ms (I had
1 ms manually configured in sway) and on my laptop with a 60 Hz screen I
get a max render time of 4.8–5.5 ms instead of the default 14.7 ms (I
had 5–6 ms manually configured in sway). Weston [1] went with a 7 ms
default.
The main downside is that if there's a sudden heavy batch of work in the
compositing, which would've made it in the default 14.7 ms but doesn't
make it in the reduced 6 ms, there is a delayed frame that would otherwise not
be there. Arguably, this happens rarely enough to be a good trade-off
for reduced latency. One possible solution is a "next frame is expected
to be heavy" function which manually increases max render time for the
next frame. This would avoid this single dropped frame at the start of
complex animations.
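
Such a hint could look like this (entirely hypothetical; not part of
this commit):

  #include <stdint.h>
  #include <glib.h>

  /* Hypothetical sketch of a "next frame is expected to be heavy" hint. */
  typedef struct
  {
    int64_t max_render_time_us; /* the automatically estimated value */
    gboolean next_frame_heavy;
  } FrameClockSketch;

  /* Called by code that knows the next frame will be expensive,
   * e.g. at the start of a complex animation. */
  static void
  frame_clock_hint_heavy_frame (FrameClockSketch *clock)
  {
    clock->next_frame_heavy = TRUE;
  }

  static int64_t
  effective_max_render_time_us (FrameClockSketch *clock,
                                int64_t           refresh_interval_us)
  {
    if (clock->next_frame_heavy)
      {
        clock->next_frame_heavy = FALSE;
        /* Fall back to the conservative pre-commit value for one frame. */
        return refresh_interval_us - 2000;
      }

    return clock->max_render_time_us;
  }
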
[1]: https://www.collabora.com/about-us/blog/2015/02/12/weston-repaint-scheduling/
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1762>
Scanout doesn't go through the usual path of compositing and calling
eglSwapBuffers(); therefore it doesn't hit the timestamp query placed in
that path. Instead, get the timings by binding the scanout buffer to an
FBO and doing a timestamp query on the FBO.
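
In rough sketch form (the real code goes through Cogl rather than raw
GL, and the extension entry points here are assumed to be resolved via
eglGetProcAddress()):

  #include <EGL/egl.h>
  #include <EGL/eglext.h>
  #include <GLES3/gl3.h>
  #include <GLES2/gl2ext.h>

  /* Sketch: wrap the scanout buffer (available as an EGLImage) in a
   * texture, attach it to an FBO, and record a timestamp query against
   * that FBO, mirroring what the eglSwapBuffers() path does. */
  static GLuint
  query_scanout_timestamp (EGLImageKHR scanout_image,
                           PFNGLEGLIMAGETARGETTEXTURE2DOESPROC image_target_texture_2d,
                           PFNGLQUERYCOUNTEREXTPROC query_counter)
  {
    GLuint tex, fbo, query;

    /* Bind the scanout buffer to a texture without copying it. */
    glGenTextures (1, &tex);
    glBindTexture (GL_TEXTURE_2D, tex);
    image_target_texture_2d (GL_TEXTURE_2D, (GLeglImageOES) scanout_image);

    /* Make that texture the color attachment of an FBO... */
    glGenFramebuffers (1, &fbo);
    glBindFramebuffer (GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D (GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                            GL_TEXTURE_2D, tex, 0);

    /* ...so a GL_TIMESTAMP_EXT query can be placed on it. */
    glGenQueries (1, &query);
    query_counter (query, GL_TIMESTAMP_EXT);

    return query;
  }
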
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1762>
Add utilities that allow getting the current GPU timestamp and creating
a query which completes upon completion of all operations currently
submitted on a framebuffer. Combined, these two allow measuring how long
it took the GPU to finish rendering something to a framebuffer.
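
In terms of the underlying GL timer-query mechanism these utilities
presumably map to, the combination looks roughly like this (sketch
only; the Cogl API itself is not shown, and the extension entry points
are assumed to be resolved via eglGetProcAddress()):

  #include <GLES3/gl3.h>
  #include <GLES2/gl2ext.h>

  /* Measure how long the GPU takes to finish everything currently
   * submitted against the bound framebuffer, using
   * EXT_disjoint_timer_query. */
  static GLint64
  measure_gpu_duration_ns (PFNGLQUERYCOUNTEREXTPROC        query_counter,
                           PFNGLGETQUERYOBJECTUI64VEXTPROC get_query_u64)
  {
    GLint64 start;
    GLuint64 end;
    GLuint query;

    /* The current GPU timestamp. */
    glGetInteger64v (GL_TIMESTAMP_EXT, &start);

    /* A query that completes once everything submitted so far on this
     * framebuffer has finished on the GPU. */
    glGenQueries (1, &query);
    query_counter (query, GL_TIMESTAMP_EXT);

    /* Reading the result here blocks until the GPU is done; real code
     * would read it back asynchronously on a later frame. */
    get_query_u64 (query, GL_QUERY_RESULT, &end);
    glDeleteQueries (1, &query);

    return (GLint64) end - start;
  }
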
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1762>
This drops some custom building of various components that are now up to
date. While at it, start using the FDO_DISTRIBUTION_PACKAGES variable to
install packages, since with the bumped ci-templates version it also
doesn't install weak dependencies.
This also requires tweaking the PipeWire deadlock workarounds, as it
changed configuration file paths.
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1865>
This fixes a warning/error:
In function 'parse_settings',
inlined from 'read_settings' at ../clutter/clutter/x11/xsettings/xsettings-client.c:398:25:
../clutter/clutter/x11/xsettings/xsettings-client.c:202:13: error: 'buffer.byte_order' may be used uninitialized [-Werror=maybe-uninitialized]
202 | if (buffer.byte_order != MSBFirst &&
| ~~~~~~^~~~~~~~~~~
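
A common fix for this class of warning is to make the initialization
unconditional so the compiler can prove the field is always set; a
minimal illustration with a hypothetical struct (the actual fix may
differ):

  /* Hypothetical stand-in for the xsettings buffer type. */
  typedef struct
  {
    char byte_order;
    /* ... */
  } Buffer;

  static void
  parse_settings (void)
  {
    /* Zero-initializing silences -Wmaybe-uninitialized even across
     * inlining, since byte_order now always has a defined value. */
    Buffer buffer = { 0 };

    /* ... */
    (void) buffer;
  }
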
This is needed to bump the CI image from F33 to F34, which includes an
upgraded compiler.
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1865>
In order to make it possible to e.g. unload an unused DRM device, we
need to make sure that we don't keep the file descriptor open if we
don't need it; otherwise we block anyone from unloading the
corresponding module.
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1828>
We need to call eglBindAPI() with GLES before we set up the secondary
GPU blit. We've been lucky not really needing this so far, as GLES,
which is what the secondary blit uses, has been the default. In order
not to depend on the default, or if we want to create the secondary
blit objects after initializing Cogl, we must make sure to bind the
right API at the right time.
As we need to bind the GLES API when setting up the secondary blit, we
need to make sure that Cogl gets the right API bound back when that's
done, so Cogl can continue working. For this, add a "bind_api()" method
on the CoglRenderer object that knows which API is correct to bind.
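
A sketch of the shape of this (simplified, hypothetical names; the
actual CoglRenderer vtable differs):

  #include <EGL/egl.h>
  #include <glib.h>

  /* Hypothetical, simplified: the renderer knows which client API its
   * context was created for, so it can rebind it after someone else
   * (e.g. the secondary GPU blit setup) has called eglBindAPI(). */
  typedef struct _CoglRendererSketch
  {
    gboolean uses_gles;
  } CoglRendererSketch;

  static void
  cogl_renderer_bind_api (CoglRendererSketch *renderer)
  {
    eglBindAPI (renderer->uses_gles ? EGL_OPENGL_ES_API : EGL_OPENGL_API);
  }

  /* Usage: bind GLES for the secondary blit setup, then hand the
   * thread-global API binding back to Cogl. */
  static void
  setup_secondary_gpu_blit (CoglRendererSketch *renderer)
  {
    eglBindAPI (EGL_OPENGL_ES_API);
    /* ... create the secondary blit shaders and objects ... */
    cogl_renderer_bind_api (renderer);
  }
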
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1828>
The DRM buffers aren't really tied to mode setting, so they shouldn't
need to have an associated mode setting device. Now that we have a
device file level object that can fill this role, port MetaDrmBuffer
and friends away from MetaKmsDevice to MetaDeviceFile.
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1828>
Keep a private MetaDeviceFile instance for the GPUs managed by the
renderer. This is a step towards decoupling rendering from mode setting,
as well as on-demand holding of device file descriptors.
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1828>
Tags are meant to make it possible for a device file opener to tag a
file if it has affected the state the file descriptor is in; e.g. if it
has enabled a DRM capability.
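
In sketch form (hypothetical names and signatures, not the actual
MetaDeviceFile API):

  #include <glib.h>

  /* An opener marks the file when it has changed the state the file
   * descriptor is in. */
  typedef enum
  {
    DEVICE_FILE_TAG_KMS_CAPABILITIES,
    DEVICE_FILE_TAG_N_TAGS,
  } DeviceFileTag;

  typedef struct
  {
    int fd;
    gboolean tags[DEVICE_FILE_TAG_N_TAGS];
  } DeviceFile;

  /* Called by e.g. a KMS opener after enabling a DRM capability. */
  static void
  device_file_tag (DeviceFile *file, DeviceFileTag tag)
  {
    file->tags[tag] = TRUE;
  }

  /* Later users can check whether the state change already happened. */
  static gboolean
  device_file_is_tagged (DeviceFile *file, DeviceFileTag tag)
  {
    return file->tags[tag];
  }
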
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1828>
Handle open() failing due to being interrupted by trying again until it
either succeeds, or fails due to some other error. This was an error
handling path taken when opening sysfs files; do the same here to not
potentially regress once we open sysfs files with the device pool.
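
The retry loop follows the classic EINTR pattern (sketch, not the
actual Mutter helper):

  #include <errno.h>
  #include <fcntl.h>

  /* Retry open() until it either succeeds or fails for a reason other
   * than being interrupted by a signal. */
  static int
  open_retry_eintr (const char *path, int flags)
  {
    int fd;

    do
      fd = open (path, flags);
    while (fd == -1 && errno == EINTR);

    return fd;
  }
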
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1828>
It's only when we take devices from or release them to logind that we
need these two integers, so only retrieve them then. Making this change
makes it possible to open devices that don't have these parameters.
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1828>
This changes the way the KMS backends load: if we're headless, we
always use the dummy one, failing if it can't be used; otherwise, we
first try the atomic backend and, if that fails, fall back on the
simple one.
The aim for this is to have the impl device open and close the device
when needed, using the device pool directly.
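
The selection order, in illustrative C (names and probe functions are
hypothetical stubs, not the actual Mutter code):

  #include <glib.h>

  /* Stub probes standing in for the real backends' checks. */
  static gboolean
  try_atomic_backend (int fd, GError **error)
  {
    /* The real check would probe atomic mode setting support on fd. */
    return FALSE;
  }

  static gboolean
  try_simple_backend (int fd, GError **error)
  {
    return TRUE;
  }

  typedef enum
  {
    KMS_IMPL_DUMMY,
    KMS_IMPL_ATOMIC,
    KMS_IMPL_SIMPLE,
  } KmsImpl;

  static gboolean
  choose_kms_impl (gboolean   headless,
                   int        fd,
                   KmsImpl   *out_impl,
                   GError   **error)
  {
    if (headless)
      {
        /* Headless always uses the dummy backend. */
        *out_impl = KMS_IMPL_DUMMY;
        return TRUE;
      }

    /* Prefer atomic mode setting... */
    if (try_atomic_backend (fd, error))
      {
        *out_impl = KMS_IMPL_ATOMIC;
        return TRUE;
      }

    /* ...and fall back on the simple (legacy) backend if that fails. */
    g_clear_error (error);
    if (try_simple_backend (fd, error))
      {
        *out_impl = KMS_IMPL_SIMPLE;
        return TRUE;
      }

    return FALSE;
  }
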
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1828>