Extreme sluggishness and CPU usage when using Atlas inside a VM #17462

Closed
opened 2026-01-31 05:43:12 +00:00 by claunia · 8 comments

Originally created by @jessey-git on GitHub (May 11, 2022).

Originally assigned to: @lhecker on GitHub.

Windows Terminal version

1.13.10983.0

Windows build number

10.0.19044.1645

Other Software

None

Steps to reproduce

  • Use Terminal inside a typical VM
  • Switch to the experimental Atlas engine
  • Observe CPU load
  • Try to type and experience sluggishness and even higher CPU load etc. There's a decent amount of lag and output from long commands tends to stutter into view; it's not very smooth

Expected Behavior

Was expecting that Atlas, in default/easy cases, would perform better than the old engine.

Looking ahead: IF atlas becomes the default and IF Terminal replaces all old cmd windows, then using this inside VMs is going to be a non-starter.

Actual Behavior

Extreme CPU usage and general sluggishness. Witness the firepower of this fully operational battle... I mean barely affordable Azure VM displaying nothing but the flashing cursor inside a blank Terminal window:

atlas-perf
atlas-dxdiag


@DHowett commented on GitHub (May 11, 2022):

Hmmmm....

I wonder if this is because we don't do differential rendering with the Atlas engine, so we're making the RDC server diff and compress at >=60fps.

@lhecker One of the main reasons we wanted to use Present1 with dirty regions was to give the graphics stack--including the remote desktop server--as much information as it needs to be successful. Is that totally off the table for Atlas? Over remote desktop, full region presentation might be so expensive that it's worthwhile for us to burn some CPU up front in Terminal.

> Witness the firepower of this fully operational battle... I mean barely affordable Azure VM

😁
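
To illustrate the differential rendering idea mentioned above: `IDXGISwapChain1::Present1` accepts a `DXGI_PRESENT_PARAMETERS` structure that can carry a list of dirty rectangles. The sketch below is not Terminal's code. It only shows, in portable C++, one way such a list might be derived by diffing two frames of text rows. The `DirtyRect` type, the `computeDirtyRects` function and the row-granular diff are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical pixel rectangle; it mirrors the left/top/right/bottom
// layout of a Win32 RECT but is defined here to keep the sketch portable.
struct DirtyRect {
    long left;
    long top;
    long right;
    long bottom;
};

// Compares two frames row by row and returns one rectangle per run of
// consecutive changed rows. Unchanged rows produce no rectangle, so a
// consumer (for example a remote desktop encoder) would have less to diff.
std::vector<DirtyRect> computeDirtyRects(const std::vector<std::string>& previous,
                                         const std::vector<std::string>& current,
                                         long cellWidth,
                                         long cellHeight,
                                         long columns)
{
    // Assumption: both frames have the same number of rows.
    assert(previous.size() == current.size());

    std::vector<DirtyRect> rects;
    std::size_t row = 0;
    while (row < current.size()) {
        if (previous[row] == current[row]) {
            ++row;
            continue;
        }
        // Extend the run while rows keep differing.
        const std::size_t begin = row;
        while (row < current.size() && previous[row] != current[row]) {
            ++row;
        }
        rects.push_back({0,
                         static_cast<long>(begin) * cellHeight,
                         columns * cellWidth,
                         static_cast<long>(row) * cellHeight});
    }
    return rects;
}
```

With 8x16 pixel cells and three columns, changing only the second and third of four rows would yield a single rectangle spanning y = 16 to y = 48. A blinking cursor on an otherwise idle screen would then dirty a single row rather than the whole window.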


@lhecker commented on GitHub (May 13, 2022):

I think this is 19041.1620.vb_release_svc_prod3.220311-1803.
I've created a similar VM using TDP and I have an RDP latency of >200ms:

image

image

I could not replicate the extreme sluggishness. The CPU load however is at around 14% compared to 4% for DxEngine. Additionally the perceived framerate while printing text is much better with DxEngine.


@lhecker commented on GitHub (May 14, 2022):

I investigated this over the last few hours. My findings:

  • Scroll rects/offsets, etc. don't help
  • Device creation with D3D_DRIVER_TYPE_HARDWARE works, but there's no difference to D3D_DRIVER_TYPE_WARP, which indicates that there's no GPU. This makes sense, since it's a VM and Intel Xeon CPUs don't have iGPUs.
  • AtlasEngine uses "dependent texture lookups" which I believe is killing the VM. I modified the shader to just draw a screen full of the cursor texture and that uses 40% "GPU", whereas the full shader with twice as many texture loads uses exactly twice that (basically using 80% CPU while rendering). Direct2D sends text straight over RDP and renders it on the client side.

Proposed solutions:

  • Meet and greet with the RDP folks to figure out if there's maybe a solution, but probably not. Direct2D docs:

    Content that is rendered by using Direct2D can also be displayed remotely by using the Remote Desktop Protocol (RDP) infrastructure in the Windows 7 operating system.

  • Dynamic resolution scaling during scrolling: We simply draw 4x fewer pixels while the screen is scrolling and drop the GPU load from 80% down to 20%. Example: https://docs.microsoft.com/en-us/windows/uwp/gaming/multisampling--scaling--and-overlay-swap-chains
    Most importantly however, this doesn't really solve the lag during typing...
  • Implement a Direct2D-based engine inside AtlasEngine, which has worse support, but works well over RDP

In the meantime, I believe that the AtlasEngine is unusable on Azure VMs without a dedicated GPU.
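
The numbers in this comment are consistent with a simple linear cost model: load scales with the number of pixels shaded and with the number of texture loads per pixel. The sketch below only restates that arithmetic. The linear model and both function names are assumptions, not measurements or code from AtlasEngine.

```cpp
#include <cassert>

// Estimated load (in percent) after scaling the render resolution.
// baselineLoad: load measured at full resolution (80% in the comment above).
// scalePerAxis: factor applied to both width and height; 0.5 halves each
//               axis, which leaves a quarter of the pixels.
// Assumption: load is proportional to the number of pixels shaded.
double estimateScaledLoad(double baselineLoad, double scalePerAxis)
{
    assert(scalePerAxis > 0.0 && scalePerAxis <= 1.0);
    return baselineLoad * scalePerAxis * scalePerAxis;
}

// Estimated load (in percent) when the number of texture loads per pixel
// changes by a given factor relative to a measured shader.
// Assumption: load is proportional to texture loads per pixel.
double estimateLoadForTextureLoads(double measuredLoad, double textureLoadFactor)
{
    assert(textureLoadFactor > 0.0);
    return measuredLoad * textureLoadFactor;
}
```

Under these assumptions, `estimateLoadForTextureLoads(40.0, 2.0)` gives the 80% observed for the full shader, and `estimateScaledLoad(80.0, 0.5)` gives the 20% quoted for drawing 4x fewer pixels.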


@lhecker commented on GitHub (May 15, 2022):

@jessey-git Do you have VS Code installed in your VM? Does its built-in terminal stutter just as badly for you? In my testing it does...
(Alternatively any other terminal based on xterm.js, or an IDE made by JetBrains, or any other software which isn't based on Direct2D.)


Some thoughts for the team:
We have this problem now, where the new renderer significantly improves performance on any GPU conceived in the last 10 years, but is significantly worse on any hardware older than that and those without a GPU at all (like your server), because they fall back to a pure software-GPU called WARP. How do we now reconcile the wishes for performance by the "many" who have long had such "new" hardware with the fundamental needs of a working terminal by the "few" who don't have such hardware? (Words in quotes are generalizations I'm making. I don't have the actual stats for these at hand. 😅)
My idea is that we could just implement a "poor man's DxEngine" inside AtlasEngine. One that supports all basic functionality, but is not the main goal of development. It would work well on such problematic hardware, but I wouldn't necessarily implement double-height/width text or sixels, etc. Basically I'd focus only on basic text rendering when no GPU is available, with added support for all basic text attributes, color and selection. Thoughts?


@jessey-git commented on GitHub (May 16, 2022):

@lhecker Yeah, that sucks too. Here's a comparison of what it looks like when holding down a key between Terminal DxEngine and VS Code's integrated terminal window:

[Terminal DxEngine Holding key]
Terminal_DxEngine_HoldingKey

[VS Code Integrated Holding key - nearly 4x higher]
Code_Integrated_HoldingKey


@DHowett commented on GitHub (May 16, 2022):

> My idea is that we could just implement a "poor man's DxEngine" inside AtlasEngine.

Honestly that just sounds like "keep the DxEngine" which is a lot easier ;)


@lhecker commented on GitHub (May 16, 2022):

> Honestly that just sounds like "keep the DxEngine" which is a lot easier ;)

We'd have to switch between the two engines, if you connect via RDP to your PC.
Also, we can reuse the DirectWrite parsing in AtlasEngine and deduplicate the code.
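
The switching concern can be made concrete with a small selection policy. The sketch below is purely illustrative and is not how Terminal decides: `RenderBackend`, `RenderEnvironment`, `chooseBackend` and `BackendSelector` are hypothetical names. On Windows, the two inputs could plausibly come from `GetSystemMetrics(SM_REMOTESESSION)` and from the `DXGI_ADAPTER_FLAG_SOFTWARE` adapter flag, but the sketch takes them as plain booleans so it stays portable.

```cpp
#include <cassert>

// Hypothetical backends: the GPU-oriented atlas renderer and a simpler
// Direct2D-style fallback for remote or software-rendered sessions.
enum class RenderBackend { Atlas, Direct2DFallback };

// Hypothetical snapshot of the environment the renderer runs in.
struct RenderEnvironment {
    bool remoteSession;   // true while the session is displayed over RDP
    bool softwareAdapter; // true when only a software rasterizer (WARP) is available
};

// Assumed policy: prefer the fallback whenever the GPU path is likely to
// be slow, i.e. over remote desktop or without hardware acceleration.
RenderBackend chooseBackend(const RenderEnvironment& env)
{
    if (env.remoteSession || env.softwareAdapter) {
        return RenderBackend::Direct2DFallback;
    }
    return RenderBackend::Atlas;
}

// Tracks the active backend so the caller knows when to recreate its
// rendering resources, e.g. after someone connects to the PC via RDP.
class BackendSelector {
public:
    // Returns true when the backend changed since the last update.
    bool update(const RenderEnvironment& env)
    {
        const RenderBackend next = chooseBackend(env);
        const bool changed = next != current_;
        current_ = next;
        return changed;
    }

    RenderBackend current() const { return current_; }

private:
    RenderBackend current_ = RenderBackend::Atlas;
};
```

Calling `update` whenever the session state changes would report exactly the transitions described above: a local session that becomes remote flips to the fallback, and flips back once the user returns to the console.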


@ghost commented on GitHub (Sep 13, 2022):

🎉This issue was addressed in #13816, which has now been successfully released as Windows Terminal Preview v1.16.252.🎉

Handy links:

  • Release Notes: https://github.com/microsoft/terminal/releases/tag/v1.16.252
  • Store Download: https://www.microsoft.com/store/apps/9n8g5rfz9xk3?cid=storebadge&ocid=badge
Reference: starred/terminal#17462