Windows Terminal can crash if a tab is closed while data is being output #12009

Closed
opened 2026-01-31 03:03:50 +00:00 by claunia · 8 comments
Owner

Originally created by @j4james on GitHub (Jan 10, 2021).

Environment

Windows build number: Version 10.0.18363.1256
Windows Terminal version (if applicable): Commit 49d008537f

Steps to reproduce

This is rather difficult to reproduce, but here is one sequence of events that is known to trigger it sometimes.

  1. Open Windows Terminal.
  2. Open a second tab with a bash shell.
  3. Output a large file (e.g. cat /mnt/c/Windows/explorer.exe).
  4. While the content is still being output, close the tab.
  5. Repeat steps 2 to 4 until it crashes.

Expected behavior

The tab should always be able close without crashing the app.

Actual behavior

Sometimes closing the tab will cause Windows Terminal to terminate with an unhandled exception.

The problem is that the TerminalControl is trying to process output when it has already been destroyed in the main thread (when the tab was closed). The exact location of the crash will vary, but it's quite often inside the Cursor::_RedrawCursorAlways method, triggered from somewhere like Terminal::_AdjustCursorPosition when adjusting the cursor position.

The reason it doesn't crash all the time, is because the renderer is often going to be painting a frame, in which case it will have grabbed the console lock, so the TerminalConnection thread wouldn't be able to output anything. And while the renderer is painting the frame, the shutdown will be blocked in the Renderer::TriggerTeardown call while waiting for the _hPaintCompletedEvent.

If you get the timing just right, though, the frame will have just finished painting, so the _hPaintCompletedEvent will have been set, and the console won't be locked. The TerminalConnection thread would then be free to grab the lock and start processing output, and the main thread wouldn't need to wait on _hPaintCompletedEvent, so could happily kill the TerminalControl while it is still in use.

Bottom line is I think we need another lock somewhere in the shutdown sequence.

Originally created by @j4james on GitHub (Jan 10, 2021). # Environment Windows build number: Version 10.0.18363.1256 Windows Terminal version (if applicable): Commit 49d008537f0bb73c8bfc85df773a46a2b3e9ac79 # Steps to reproduce This is rather difficult to reproduce, but here is one sequence of events that is known to trigger it sometimes. 1. Open Windows Terminal. 2. Open a second tab with a bash shell. 3. Output a large file (e.g. `cat /mnt/c/Windows/explorer.exe`). 4. While the content is still being output, close the tab. 5. Repeat steps 2 to 4 until it crashes. # Expected behavior The tab should always be able close without crashing the app. # Actual behavior Sometimes closing the tab will cause Windows Terminal to terminate with an unhandled exception. The problem is that the `TerminalControl` is trying to process output when it has already been destroyed in the main thread (when the tab was closed). The exact location of the crash will vary, but it's quite often inside the `Cursor::_RedrawCursorAlways` method, triggered from somewhere like `Terminal::_AdjustCursorPosition` when adjusting the cursor position. The reason it doesn't crash all the time, is because the renderer is often going to be painting a frame, in which case it will have grabbed the console lock, so the `TerminalConnection` thread wouldn't be able to output anything. And while the renderer is painting the frame, the shutdown will be blocked in the `Renderer::TriggerTeardown` call while waiting for the `_hPaintCompletedEvent`. If you get the timing just right, though, the frame will have just finished painting, so the `_hPaintCompletedEvent` will have been set, and the console won't be locked. The `TerminalConnection` thread would then be free to grab the lock and start processing output, and the main thread wouldn't need to wait on `_hPaintCompletedEvent`, so could happily kill the `TerminalControl` while it is still in use. Bottom line is I think we need another lock somewhere in the shutdown sequence.
Author
Owner

@DHowett commented on GitHub (Jan 11, 2021):

I think there's another one somewhere in here... we're running a pattern search on the text buffer after a delay, and I fear sometimes we attempt to do it after the buffer's long gone.

@DHowett commented on GitHub (Jan 11, 2021): I think there's another one somewhere in here... we're running a pattern search on the text buffer after a delay, and I fear sometimes we attempt to do it after the buffer's long gone.
Author
Owner

@zadjii-msft commented on GitHub (Jan 11, 2021):

Should this be 1.6 blocking? If the root cause is the pattern matching, then maybe. If it's not, and the solution is too complicated, then let's just leave it for 2.0

@zadjii-msft commented on GitHub (Jan 11, 2021): Should this be 1.6 blocking? If the root cause is the pattern matching, then _maybe_. If it's not, and the solution is too complicated, then let's just leave it for 2.0
Author
Owner

@j4james commented on GitHub (Jan 12, 2021):

I've just had a look at the pattern matching code, and it looks to me like it's being dispatched on the main thread, so it can't be running at the same time the tab is being shutdown, so I wouldn't have thought it would be a problem (not claiming that with any certainty though). But if there is a problem there, I'd expect it to be completely different to the crash I'm seeing.

@j4james commented on GitHub (Jan 12, 2021): I've just had a look at the pattern matching code, and it looks to me like it's being dispatched on the main thread, so it can't be running at the same time the tab is being shutdown, so I wouldn't have thought it would be a problem (not claiming that with any certainty though). But if there is a problem there, I'd expect it to be completely different to the crash I'm seeing.
Author
Owner

@DHowett commented on GitHub (Jan 28, 2021):

Ugh. We'll probably need to service 1.6 with a fix for this. P1, Sev-Crash, and we shipped anyway? What's wrong with me? Bad release engineer!

@DHowett commented on GitHub (Jan 28, 2021): Ugh. We'll probably need to service 1.6 with a fix for this. P1, Sev-Crash, and we shipped anyway? What's wrong with me? Bad release engineer!
Author
Owner

@j4james commented on GitHub (Jan 31, 2021):

I think I have a solution for this, but first I want to describe how you can consistently reproduce the issue. This requires running the app from within a debugger so you can freeze threads so they're aligned in exactly the right way.

There are three threads we need to worry about:

  1. The render thread, which is going to be in a loop in RenderThread::_ThreadProc, where it resets _hPaintCompletedEvent, paints a frame (during which the terminal will be locked), and then sets _hPaintCompletedEvent once it is done.
  2. The connection thread, which handles the output. This acquires the terminal lock in the Terminal::Write method, processes the output, and then release the lock again. At some point it'll likely try to access the renderer, e.g. in _RedrawCursorAlways.
  3. The main thread, which handles the closing of the window. This ends up in TermControl::Close, where it will stop the connection accepting new output, trigger the teardown of the renderer (which requires waiting for _hPaintCompletedEvent), and then destroy the renderer.

To cause the crash, we need to align these threads as shown in the table below. In this scenario it's best to run with a single tab, so there's no confusion about which of the render and connection threads you're dealing with.

Render Thread Connection Thread Main Thread
  TermControl::Close
  (1. break here and freeze)
ResetEvent(_hPaintCompletedEvent)
Renderer::PaintFrame
SetEvent(_hPaintCompletedEvent)
(2. break here and freeze)
  LockForWriting
  Cursor::_RedrawCursorAlways
  (3. break here and freeze)
  (4. Thaw and break at method end)
  Stop accepting output
  Renderer::TriggerTeardown
(5. Thaw and run)
Render thread ends
  Destroy renderer
  (6. Thaw and run)
  _parentBuffer.GetRenderTarget()
  Unlock LockForWriting
  1. Place a breakpoint at the start of TermControl::Close. Begin outputting a large file in the terminal and close the window while that is still in progress. When the breakpoint is reached, freeze the main thread.
  2. Place a breakpoint following the SetEvent call in RenderThread::_ThreadProc (this may require looking at the disassembly if you're using a release build). Once the breakpoint is reached, freeze the thread. At this point the terminal lock is released, so it can't hold up the connection thread, and and the event is set so it can't stop the main thread from tearing down the renderer.
  3. Place a breakpoint in the Cursor::_RedrawCursorAlway method, run until that is reached, and then freeze the thread. We're now about to access the renderer, while the main thread is free to destroy it.
  4. Place a breakpoint in the main thread at the end of TermControl::Close and thaw the thread. This will trigger the renderer teardown, then then begin to destroy the renderer, but it will block while the render thread is still active.
  5. Thaw the render thread. Once it exits, that will unblock the main thread, and allow it to finished destroying the renderer. When it breaks at the end of the Close method, freeze the thread so we don't close completely yet.
  6. Thaw the connection thread. It will then attempt to access the renderer via GetRenderTarget, but since the renderer has been destroyed, it'll most likely crash.

To fix this we need to make sure the renderer isn't destroyed while the connection thread is still in a position to access it. I think this can be achieved by acquiring the terminal lock, and immediately releasing it, just before the call to TriggerTeardown.

That should force it to wait until the connection thread is unlocked before destroying the renderer. And there shouldn't be any chance of the connection thread reacquiring the lock, because we've already stopped accepting output.

Note that we can't leave the terminal locked while TriggerTeardown is called, because it'll be waiting for the render thread to finish. And if the renderer is blocked while waiting for the terminal lock, that would result in a deadlock.

@j4james commented on GitHub (Jan 31, 2021): I think I have a solution for this, but first I want to describe how you can consistently reproduce the issue. This requires running the app from within a debugger so you can freeze threads so they're aligned in exactly the right way. There are three threads we need to worry about: 1. The render thread, which is going to be in a loop in `RenderThread::_ThreadProc`, where it resets `_hPaintCompletedEvent`, paints a frame (during which the terminal will be locked), and then sets `_hPaintCompletedEvent` once it is done. 2. The connection thread, which handles the output. This acquires the terminal lock in the `Terminal::Write` method, processes the output, and then release the lock again. At some point it'll likely try to access the renderer, e.g. in `_RedrawCursorAlways`. 3. The main thread, which handles the closing of the window. This ends up in `TermControl::Close`, where it will stop the connection accepting new output, trigger the teardown of the renderer (which requires waiting for `_hPaintCompletedEvent`), and then destroy the renderer. To cause the crash, we need to align these threads as shown in the table below. In this scenario it's best to run with a single tab, so there's no confusion about which of the render and connection threads you're dealing with. Render Thread | Connection Thread | Main Thread ------------------------------- | ------------------------------------- | --------------------------------   | | TermControl::Close   | | (1. break here and freeze) ResetEvent(_hPaintCompletedEvent)| | Renderer::PaintFrame | | SetEvent(_hPaintCompletedEvent) | | (2. break here and freeze) | |   | LockForWriting |   | Cursor::_RedrawCursorAlways |   | (3. break here and freeze) |   | | (4. Thaw and break at method end)   | | Stop accepting output   | | Renderer::TriggerTeardown (5. Thaw and run) | | Render thread ends | |   | | Destroy renderer   | (6. Thaw and run) |   | _parentBuffer.GetRenderTarget() |   | Unlock LockForWriting | 1. Place a breakpoint at the start of `TermControl::Close`. Begin outputting a large file in the terminal and close the window while that is still in progress. When the breakpoint is reached, freeze the main thread. 2. Place a breakpoint following the `SetEvent` call in `RenderThread::_ThreadProc` (this may require looking at the disassembly if you're using a release build). Once the breakpoint is reached, freeze the thread. At this point the terminal lock is released, so it can't hold up the connection thread, and and the event is set so it can't stop the main thread from tearing down the renderer. 3. Place a breakpoint in the `Cursor::_RedrawCursorAlway` method, run until that is reached, and then freeze the thread. We're now about to access the renderer, while the main thread is free to destroy it. 4. Place a breakpoint in the main thread at the end of `TermControl::Close` and thaw the thread. This will trigger the renderer teardown, then then begin to destroy the renderer, but it will block while the render thread is still active. 5. Thaw the render thread. Once it exits, that will unblock the main thread, and allow it to finished destroying the renderer. When it breaks at the end of the `Close` method, freeze the thread so we don't close completely yet. 6. Thaw the connection thread. It will then attempt to access the renderer via `GetRenderTarget`, but since the renderer has been destroyed, it'll most likely crash. To fix this we need to make sure the renderer isn't destroyed while the connection thread is still in a position to access it. I think this can be achieved by acquiring the terminal lock, and immediately releasing it, just before the call to `TriggerTeardown`. That should force it to wait until the connection thread is unlocked before destroying the renderer. And there shouldn't be any chance of the connection thread reacquiring the lock, because we've already stopped accepting output. Note that we can't leave the terminal locked while `TriggerTeardown` is called, because it'll be waiting for the render thread to finish. And if the renderer is blocked while waiting for the terminal lock, that would result in a deadlock.
Author
Owner

@zadjii-msft commented on GitHub (Feb 1, 2021):

😮

what a great writeup, for what amounts to a 1 line change.

@zadjii-msft commented on GitHub (Feb 1, 2021): 😮 what a great writeup, for what amounts to a 1 line change.
Author
Owner

@ghost commented on GitHub (Feb 11, 2021):

:tada:This issue was addressed in #8982, which has now been successfully released as Windows Terminal v1.5.10411.0.🎉

Handy links:

@ghost commented on GitHub (Feb 11, 2021): :tada:This issue was addressed in #8982, which has now been successfully released as `Windows Terminal v1.5.10411.0`.:tada: Handy links: * [Release Notes](https://github.com/microsoft/terminal/releases/tag/v1.5.10411.0) * [Store Download](https://www.microsoft.com/store/apps/9n8g5rfz9xk3?cid=storebadge&ocid=badge)
Author
Owner

@ghost commented on GitHub (Feb 11, 2021):

:tada:This issue was addressed in #8982, which has now been successfully released as Windows Terminal Preview v1.6.10412.0.🎉

Handy links:

@ghost commented on GitHub (Feb 11, 2021): :tada:This issue was addressed in #8982, which has now been successfully released as `Windows Terminal Preview v1.6.10412.0`.:tada: Handy links: * [Release Notes](https://github.com/microsoft/terminal/releases/tag/v1.6.10412.0) * [Store Download](https://www.microsoft.com/store/apps/9n8g5rfz9xk3?cid=storebadge&ocid=badge)
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/terminal#12009