[PR #12552] AtlasEngine: Reduce shader power draw with explicit branching #29109

Closed
opened 2026-01-31 09:32:51 +00:00 by claunia · 0 comments

Original Pull Request: https://github.com/microsoft/terminal/pull/12552

State: closed
Merged: Yes


Many articles I read while writing this engine claimed that GPUs can't
do branches like CPUs can. One common approach to branching in GPUs is
apparently to "mask" out results, a technique called branch predication.
The GPU will simply execute all instructions in your shader linearly,
but if a branch isn't taken, it'll ignore the computation results.
This is unfortunate for our shader, since most of our branches are
only rarely taken. The cursor, for instance, is only drawn
on a single cell, and underlines are seldom used.
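
To make the cost of predication concrete, here is a toy cost model in Python. It is only a sketch and not shader code: the instruction counts, the function names and the 80x24 grid are all made-up assumptions, chosen to show how much masked work is thrown away when the optional paths are rare.

```python
# Toy cost model, not real GPU code. All names and costs are hypothetical.
BASE_COST = 10      # assumed instruction count for the common path
CURSOR_COST = 6     # assumed cost of the rarely needed cursor path
UNDERLINE_COST = 4  # assumed cost of the rarely needed underline path


def predicated_cost(cells):
    """Instructions issued when every branch is flattened into masked work."""
    total = 0
    for _has_cursor, _has_underline in cells:
        # Both optional paths run for every cell. The flags only decide
        # whether the result is kept, not whether the work is done.
        total += BASE_COST + CURSOR_COST + UNDERLINE_COST
    return total


def useful_cost(cells):
    """Instructions whose results are actually needed."""
    total = 0
    for has_cursor, has_underline in cells:
        total += BASE_COST
        if has_cursor:
            total += CURSOR_COST
        if has_underline:
            total += UNDERLINE_COST
    return total


# An assumed 80x24 grid with a single cursor cell and no underlines.
cells = [(i == 0, False) for i in range(80 * 24)]
wasted = predicated_cost(cells) - useful_cost(cells)
```

With these assumed numbers, roughly half of the issued instructions produce results that are discarded.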

But apparently modern GPUs (2010s and later?) are actually entirely
capable of branching, if all lanes ("pixels") processed by a
wave ("GPU core") take the same branch.
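
The condition can be illustrated with a small Python model of wave-level branching. This is a sketch under stated assumptions, not a description of any particular GPU: the wave size of 32, the per-lane cost of 6 and the function name are hypothetical, and real hardware assigns pixels to waves in ways this model ignores.

```python
WAVE_SIZE = 32  # assumed; real wave sizes differ between vendors


def wave_branch_cost(lane_flags, branch_cost, wave_size=WAVE_SIZE):
    """Cost of one optional branch when whole waves can skip it.

    A wave pays for the branch on all of its lanes if at least one lane
    takes it, and pays nothing if no lane takes it.
    """
    total = 0
    for start in range(0, len(lane_flags), wave_size):
        wave = lane_flags[start:start + wave_size]
        if any(wave):
            total += branch_cost * len(wave)
    return total


# 1920 lanes, of which exactly one needs the (hypothetical) cursor path.
lanes = [i == 0 for i in range(1920)]
branching = wave_branch_cost(lanes, branch_cost=6)
always_executed = 6 * len(lanes)  # cost if the branch is never skipped
```

In this model only the one wave containing the cursor lane pays for the branch, while the other 59 waves skip it entirely.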

On both my Nvidia GPU (RTX 3080) and Intel iGPU (Intel HD Graphics 530)
this change has a positive impact on power draw. Most noticeably, on the
latter it reduces power draw from 900 mW to 600 mW at 60 FPS.
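
For scale, the Intel numbers can be turned into a relative saving and a per-frame energy figure. This is plain arithmetic on the two measurements quoted above, and it assumes the power draw is spread evenly across the 60 frames rendered each second.

```python
before_mw = 900  # measured power draw before the change, in milliwatts
after_mw = 600   # measured power draw after the change, in milliwatts
fps = 60

# Relative reduction in power draw.
saving = (before_mw - after_mw) / before_mw

# Milliwatts divided by frames per second gives millijoules per frame.
energy_before_mj = before_mw / fps
energy_after_mj = after_mw / fps
```

That works out to a reduction of about one third, or 15 mJ per frame down to 10 mJ per frame.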

PR Checklist

  • I work here
  • Tests added/passed

Validation Steps Performed

It seems to work fine on Intel and Nvidia GPUs.
Unfortunately I don't have an AMD GPU to test this on, but I suspect it can't be worse.

claunia added the pull-request label 2026-01-31 09:32:51 +00:00

Reference: starred/terminal#29109