sungmg/mutter-performance-source

Author	SHA1	Message	Date
Daniel van Vugt	29e7e4990c	clutter/frame-clock: Optimize latency for platforms missing TIMESTAMP_QUERY Previously if we had no measurements then `compute_max_render_time_us` would pessimise its answer to ensure triple buffering could be reached: ``` if (frame_clock->state == CLUTTER_FRAME_CLOCK_STATE_DISPATCHED_ONE) ret += refresh_interval_us; ``` But that also meant entering triple buffering even when not required. Now we make `compute_max_render_time_us` more honest and return failure if the answer isn't known (or is disabled). This in turn allows us to optimize `calculate_next_update_time_us` for this special case, ensuring triple buffering can be used, but isn't blindly always used. This makes a visible difference to the latency when dragging windows in Xorg, but will also help Wayland sessions on platforms lacking TIMESTAMP_QUERY such as Raspberry Pi. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:19 +09:00
Daniel van Vugt	2d7d3999cc	clutter/frame-clock: Record measurements of zero for cursor-only updates But only if we've ever got actual swap measurements (COGL_FEATURE_ID_TIMESTAMP_QUERY). If it's supported then we now drop to double buffering and get optimal latency on a burst of cursor-only updates. Closes: https://launchpad.net/bugs/2023363 Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:19 +09:00
Daniel van Vugt	5901e0d170	onscreen/native: Avoid callbacks on "detached" onscreens Detached onscreens have no valid view so avoid servicing callbacks on them during/after sleep mode. As previously mentioned in `45bda2d969`. Closes: https://launchpad.net/bugs/2020049 Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:19 +09:00
Daniel van Vugt	35fbafab53	tests/native-kms-render: Fix failing client-scanout test It was assuming an immediate transition from compositing (triple buffering) to direct scanout (double buffering), whereas there is a one frame delay in that transition as the buffer queue shrinks. We don't lose any frames in the transition. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:18 +09:00
Daniel van Vugt	ac3b696448	clutter/frame-clock: Conditionally disable triple buffering 1. When direct scanout is attempted There's no compositing during direct scanout so the "render" time is zero. Thus there is no need to implement triple buffering for direct scanouts. Stick to double buffering and enjoy the lower latency. 2. If disabled by environment variable MUTTER_DEBUG_TRIPLE_BUFFERING With possible values {never, auto, always} where auto is the default. 3. When VRR is in use VRR calls `clutter_frame_clock_schedule_update_now` which would keep the buffer queue full, which in turn prevented direct scanout mode. Because OnscreenNative currently only supports direct scanout with double buffering. We now break that feedback loop by preventing triple buffering from being scheduled when the frame clock mode becomes variable. Long term this could also be solved by supporting triple buffering in direct scanout mode. But whether or not that would be desirable given the latency penalty remains to be seen. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:18 +09:00
Daniel van Vugt	0ee1a3953c	clutter: Pass ClutterFrameHint(s) to the frame clock Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:18 +09:00
Daniel van Vugt	6b62ab7fbc	backends: Flag that the frame attempted direct scanout We need this hint whether direct scanout succeeds or fails because it's the mechanism by which we will tell the clock to enforce double buffering, thus making direct scanout possible on future frames. Triple buffering will be disabled until such time that direct scanout is not being attempted. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:18 +09:00
Daniel van Vugt	8eff581c4b	clutter/frame: Add ClutterFrameHint to ClutterFrame This will allow the backend to provide performance hints to the frame clock in future. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:18 +09:00
Daniel van Vugt	ee468ad8a9	clutter/frame-clock: Log N-buffers in CLUTTER_DEBUG=frame-timings Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:18 +09:00
Daniel van Vugt	5e87a856fd	clutter/frame-clock: Add triple buffering support Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:18 +09:00
Daniel van Vugt	bd4b3178cb	clutter/frame-clock: Merge states DISPATCHING and PENDING_PRESENTED Chronologically they already overlap in time as presentation may complete in the middle of the dispatch function, otherwise they are contiguous in time. And most switch statements treated the two states the same already so they're easy to merge into a single `DISPATCHED` state. Having fewer states now will make life easier when we add more states later. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:18 +09:00
Daniel van Vugt	d5074791e4	clutter/frame-clock: Lower the threshold for disabling error diffusion Error diffusion was introduced in `0555a5bbc1` for Nvidia where last presentation time is always unknown (zero). Dispatch times would drift apart always being a fraction of a frame late, and accumulated to cause periodic frame skips. So error diffusion corrected that precisely and avoided the skips. That works great with double buffering but less great with triple buffering. It's certainly still needed with triple buffering but correcting for a lateness of many milliseconds isn't a good idea. That's because a dispatch being that late is not due to main loop jitter but due to Nvidia's swap buffers blocking when the queue is full. So scheduling the next frame even earlier using last_dispatch_lateness_us would just perpetuate the problem of swap buffers blocking for too long. So now we lower the threshold of when error diffusion gets disabled. It's still high enough to fix the original smoothness problem it was for, but now low enough to detect Nvidia's occasionally blocking swaps and backs off in that case. Since the average duration of a blocking swap is half a frame interval and we want to distinguish between that and sub-millisecond jitter, the logical threshold is halfway again: refresh_interval_us/4. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:17 +09:00
Daniel van Vugt	1202195313	renderer/native: Discard pending swaps when rebuilding views It's analogous to discard_pending_page_flips but represents swaps that might become flips after the next frame notification callbacks, thanks to triple buffering. Since the views are being rebuilt and their onscreens are about to be destroyed, turning those swaps into more flips/posts would just lead to unexpected behaviour (like trying to flip on a half-destroyed inactive CRTC). Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:17 +09:00
Daniel van Vugt	c5da041302	onscreen/native: Skip try_post_latest_swap if shutting down Otherwise we could get: meta_kms_prepare_shutdown -> flush_callbacks -> ... -> try_post_latest_swap -> post and queue more callbacks So later in shutdown those callbacks would trigger an assertion failure in meta_kms_impl_device_atomic_finalize: g_hash_table_size (impl_device_atomic->page_flip_datas) == 0 Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:17 +09:00
Daniel van Vugt	1ae03b74a4	onscreen/native: Add function meta_onscreen_native_discard_pending_swaps Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:17 +09:00
Daniel van Vugt	77f5eff164	onscreen/native: Increase secondary GPU dumb_fbs from 2 to 3 So that they don't get overwritten prematurely during triple buffering causing tearing. https://launchpad.net/bugs/1999216 Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:17 +09:00
Daniel van Vugt	4d5b0277a2	onscreen/native: Defer posting if there's already a post in progress And when the number of pending posts decreases we know it's safe to submit a new one. Since KMS generally only supports one outstanding post right now, "decreases" means equal to zero. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:17 +09:00
Daniel van Vugt	9acc23676e	onscreen/native: Insert a 'posted' frame between 'next' and 'presented' This will allow us to keep track of up to two buffers that have been swapped but not yet scanning out, for triple buffering. This commit replaces mutter!1968 Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:17 +09:00
Daniel van Vugt	4399f97312	onscreen/native: Split swap_buffers_with_damage into two functions 1. The EGL part: meta_onscreen_native_swap_buffers_with_damage 2. The KMS part: post_latest_swap Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:17 +09:00
Daniel van Vugt	4e094888c7	onscreen/native: Deduplicate calls to clutter_frame_set_result All paths out of `meta_onscreen_native_swap_buffers_with_damage` from here onward would set the same `CLUTTER_FRAME_RESULT_PENDING_PRESENTED` (or terminate with `g_assert_not_reached`). Even failed posts set this result because they will do a `meta_onscreen_native_notify_frame_complete` in `page_flip_feedback_discarded`. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:17 +09:00
Daniel van Vugt	ed4bf1e165	onscreen/native: Replace an assertion that double buffering is the maximum Because it soon won't be the maximum. But we do want to verify that the frame info queue is not empty, to avoid NULL dereferencing and catch logic errors. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:16 +09:00
Daniel van Vugt	4d838f348c	onscreen/native: Log swapbuffers and N-buffering when MUTTER_DEBUG=kms Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:16 +09:00
Daniel van Vugt	5a1b32f6fd	backends/native: Add set/get_damage functions to MetaFrameNative Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:16 +09:00
Daniel van Vugt	e5374ba456	renderer/native: Steal the power save flip list before iterating over it Because a single iteration might also grow the list again. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:16 +09:00
Daniel van Vugt	19ffbaaa2a	renderer/native: Avoid requeuing the same onscreen for a power save flip This is a case that triple buffering will encounter. We don't want it to queue the same onscreen multiple times because that would represent multiple flips occurring simultaneously. It's a linear search but the list length is typically only 1 or 2 so no need for anything fancier yet. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:16 +09:00
Daniel van Vugt	c2da109fdc	kms: Keep a shutting_down flag Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:16 +09:00
Daniel van Vugt	fc7ece8a13	cogl/onscreen: Indent declaration parameters to align with above This fixes warnings from check-code-style. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:16 +09:00
Daniel van Vugt	0b33663007	cogl/onscreen: Add function cogl_onscreen_get_pending_frame_count Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 14:00:15 +09:00
Gert-dev	2cdcc4b9cc	onscreen/native: Use EGLSyncs instead of cogl_framebuffer_finish cogl_framebuffer_finish can result in a CPU-side stall because it waits for the primary GPU to flush and execute all commands that were queued before that. By using a GPU-side EGLSync we can let the primary GPU inform us when it is done with the queued commands instead. We then create another EGLSync on the secondary GPU using the same fd so the primary GPU effectively signals the secondary GPU when it is done rendering, causing the latter to wait for the former before copying part of the frames it needs for monitors attached to it directly. This solves the corruption that cogl_framebuffer_finish also solved, but without needing a CPU-side stall. Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:59:55 +09:00
Michel Dänzer	1444e82cd8	onscreen/native: Set latest cogl sync_fd on KMS update See previous commit log on the effects of this. This means the deadline evasion needs to be added in both cases in clutter_frame_clock_notify_presented. v2: * Use meta_kms_update_set_sync_fd. (Jonas Ådahl) Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3958> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:59:32 +09:00
Michel Dänzer	a9a221933e	kms/impl-device: Handle sync_fd in meta_kms_impl_device_handle_update If the KMS thread is using the deadline timer, and a valid sync_file descriptor is passed in: 1. The update is deferred, and the deadline timer is left armed, until the sync_fd signals (becomes readable). 2. Implicit synchronization is disabled for the KMS update. This means cursor updates should no longer miss a display refresh cycle due to mutter's compositing GPU work finishing too late. v2: * Use g_autoptr for GSource in meta_kms_impl_device_handle_update. (Sebastian Wick) v3: * Use meta_kms_update_get_sync_fd, don't track sync_fd in CrtcFrame::submitted_update. (Jonas Ådahl) v4: * Clean up CrtcFrame::submitted_update members in crtc_frame_free. v5: * Coding style cleanup in meta_kms_impl_device_handle_update. (Jonas Ådahl) Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3958> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:59:32 +09:00
Michel Dänzer	168839e317	kms/update: Add meta_kms_update_get/set_sync_fd v2: * Use g_steal_fd in meta_kms_update_merge_from. (Jonas Ådahl) Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3958> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:59:32 +09:00
Michel Dänzer	011b6f7c96	kms/plane: Rename META_KMS_ASSIGN_PLANE_FLAG_DIRECT_SCANOUT To META_KMS_ASSIGN_PLANE_FLAG_DISABLE_IMPLICIT_SYNC. This describes the effect of the flag, instead of the circumstances it's currently used for. Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3958> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:59:32 +09:00
Michel Dänzer	a364e785f9	kms/crtc: Conditionally return 0 in meta_kms_crtc_get_deadline_evasion If both crtc->shortterm_max_dispatch_duration_us and crtc->deadline_evasion_us are 0, i.e. we're not using the deadline timer. v2: * Fix coding style. (Jonas Ådahl) Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3958> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:59:31 +09:00
Michel Dänzer	e3bbee2630	kms/impl-device: Track dispatch duration in crtc_frame_deadline_dispatch And take it into account in meta_kms_crtc_get_deadline_evasion. This uses the same fundamental approach as clutter frame clock scheduling: Measure the deadline timer dispatch duration, keep track of the longest duration, and set the timer to fire such that the longest measured dispatch duration would result in it completing shortly before start of vblank. Closes: https://gitlab.gnome.org/GNOME/mutter/-/issues/3612 v2: * Move DEADLINE_EVASION_CONSTANT_US addition from meta_kms_crtc_determine_deadline to meta_kms_crtc_get_deadline_evasion. * Calculate how long before start of vblank dispatch completed for debug output in crtc_frame_deadline_dispatch. * Shorten over-long lines in crtc_frame_deadline_dispatch. Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3934> Signed-off-by: Mingi Sung <sungmg@saltyming.net> (cherry picked from commit 88e7f353)	2024-09-15 13:58:58 +09:00
Michel Dänzer	b5dffcdc67	kms/impl-device: Use KMS_DEADLINE in crtc_page_flip_feedback_flipped It's useful for this to match the debug topic in crtc_frame_deadline_dispatch. Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3934> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:58 +09:00
Daniel van Vugt	0411de33b5	kms/impl-device: Add debug logging for deadline dispatch lateness And also "completion" time to measure when the commit returned. This is structured so as to measure all timestamps first before logging anything. That way our results shouldn't be (don't seem to be) affected by the logging itself. Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3265> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:58 +09:00
Daniel van Vugt	739ad5590b	kms/impl-device: Remember the expected deadline dispatch time Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3265> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:58 +09:00
Daniel van Vugt	7f88fd419b	Add debug topic "kms-deadline" Which will allow us to report on deadline timings without influencing the CPU clock like the busy "kms" topic does. Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3265> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:58 +09:00
Jonas Ådahl	c9bb8a16a2	kms: Don't disarm deadline timer when compositing If we finish compositing in time, the composited result will be submitted before the deadline timer is triggered, and we'll be fine, and if not, at least the cursor updates will be smooth, which makes it appear smoother than not. There is a risk that this can negatively impact composited updates when moving the cursor, so make it possible to toggle a paint-debug flag for now until this has been tested more. This also means we need to disarm the deadline timer after handling update, as there might be a scheduled cursor update pending, but we already handled it, so disarm the timer. Here is an illustration of the difference. In the following scenario, with disarming, the composited frame E, and the cursor movement C get presented. With this branch, only the cursor movement C gets presented. ``` * A: beginning of composited frame * B: begin notification reaches KMS thread * C: cursor moved * D: calculated deadline dispatch time (disabled with the branch) * E: KMS update posted * F: KMS update reaches KMS thread * G: actual deadline (and with branch and gets committed) Compositor thread: --------A---------------E--------- \ \ \ \ KMS thread: -----------B------C----D---F-G---- ``` In the following scenario, by not disarming, the cursor update C will be presented, and the would-be-delayed composited frame E would be delayed anyway, i.e. fixing cursor stutter. ``` * A: beginning of composited frame * B: begin notification reaches KMS thread * C: cursor moved * D: calculated deadline dispatch time (and with branch will be dispatched) * E: KMS update posted * F: actual deadline * G: KMS update reaches KMS thread (and with branch gets postponed) Compositor thread: --------A---------------E--------- \ \ \ \ KMS thread: -----------B------C----D-F-G------ ``` Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3184> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:58 +09:00
Jonas Ådahl	efadfc4a94	renderer-view/native: Update deadline evasion each frame The deadline evasion depends on debug flags, but they are not trackable, so update the deadline evasion each time we schedule an update. Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3184> Signed-off-by: Mingi Sung <sungmg@saltyming.net> (cherry picked from commit `6ec1312384`)	2024-09-15 13:58:57 +09:00
Jonas Ådahl	96ca767e22	clutter/frame-clock: Take deadline evasion into account This is meant to be the amount of time before a CRTC deadline we're usually dispatching at. It's not yet set by anything however. Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3184> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:57 +09:00
Daniel van Vugt	9e07b3be72	onscreen/native: Return GErrors from secondary GPU updates And return early from `swap_buffers_with_damage` if the error would have led to flipping a NULL buffer. This is also the perfect time to remove the `egl_context_changed` parameter and move `_cogl_winsys_egl_ensure_current` closer to the code that actually needs it. Related: https://bugs.launchpad.net/bugs/2069565 Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3817> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:34 +09:00
Daniel van Vugt	99bbd37b02	onscreen/native: Set frame result to IDLE on swap failure So that swap failure messages are not also followed by: meta_stage_native_redraw_view: runtime check failed: (!META_IS_CRTC_KMS (crtc)) Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3817> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:34 +09:00
Daniel van Vugt	608f6d1223	onscreen/native: Unify the failure paths of swap_buffers_with_damage They're both the same and a third one will be added soon. Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3817> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:34 +09:00
Daniel van Vugt	3214e92918	onscreen/native: Squash adjacent switch statements Because we can. And it's now clearer that `buffer` is only used in `META_RENDERER_NATIVE_MODE_GBM`. Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3891> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:34 +09:00
Daniel van Vugt	bc74aadcc2	onscreen/native: Move next_frame storage to later in the function It won't be used until later when we flip, and in fact assigning it early could have led to its own assertion failing on the next frame in the unlikely event that we return with "Failed to ensure KMS FB ID... Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3891> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:33 +09:00
Daniel van Vugt	1815af679f	onscreen/native: Return the framebuffer by result, not parameters `update_secondary_gpu_state_post_swap_buffers` decides what our front buffer object will be. There is only one answer. So return it as the function result instead of making the caller figure it out. Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3830> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:33 +09:00
Daniel van Vugt	e53f0e1463	onscreen/native: Remove frame parameter from flip_crtc It's always equal to `onscreen_native->next_frame` and we can't eliminate that copy so easily. Removing the parameter removes all ambiguity about where the next frame will come from. Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3829> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:33 +09:00
Jonas Ådahl	f45d4c0c7f	onscreen/native: Track next and presenting buffers via ClutterFrame Let the ClutterFrame (or rather MetaFrameNative) own both the scanout object and the framebuffer object, and let the frame itself live for as long as it's needed. This allows fields that are related to a single frame to be placed together, aiming to help reasoning about the lifetime of the fields that were previously directly stored in MetaOnscreenNative. Also take the opportunity to rename "current" to "presenting", to make it clearer that the frame's buffer is what is currently being presented to the user. Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3799> Signed-off-by: Mingi Sung <sungmg@saltyming.net>	2024-09-15 13:58:29 +09:00

32018 commits