ClosePseudoConsole API hanging #2535

Closed
opened 2026-01-30 22:57:37 +00:00 by claunia · 58 comments
Owner

Originally created by @sbatten on GitHub (Jul 3, 2019).

Environment

Windows build number: Microsoft Windows [Version 10.0.18362.175]

Any other software? Visual Studio Code Insiders

Steps to reproduce

  1. Launch VS Code Insiders.
  2. Ensure using conpty with setting: terminal.integrated.windowsEnableConpty
  3. Open a terminal with Ctrl+`
  4. Close the terminal with the trash can icon
  5. Repeat 3 and 4 until the window hangs. It takes around 10 retries on my machine.

Expected behavior

No hanging.

Actual behavior

The window hangs. Using windbg, I see the following:

conpty_stack
conpty_locals
conpty_leftovers

Notice the many leftover conhosts in the task manager as well as the hang in ClosePseudoConsole.

Here is the link to the code in node-pty calling into the PseudoConsole API.
04445ed76f/src/win/conpty.cc (L429)

/cc @daimms

Originally created by @sbatten on GitHub (Jul 3, 2019). <!-- 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨 I ACKNOWLEDGE THE FOLLOWING BEFORE PROCEEDING: 1. If I delete this entire template and go my own path, the core team may close my issue without further explanation or engagement. 2. If I list multiple bugs/concerns in this one issue, the core team may close my issue without further explanation or engagement. 3. If I write an issue that has many duplicates, the core team may close my issue without further explanation or engagement (and without necessarily spending time to find the exact duplicate ID number). 4. If I leave the title incomplete when filing the issue, the core team may close my issue without further explanation or engagement. 5. If I file something completely blank in the body, the core team may close my issue without further explanation or engagement. All good? Then proceed! --> <!-- This bug tracker is monitored by Windows Terminal development team and other technical folks. **Important: When reporting BSODs or security issues, DO NOT attach memory dumps, logs, or traces to Github issues**. Instead, send dumps/traces to secure@microsoft.com, referencing this GitHub issue. Please use this form and describe your issue, concisely but precisely, with as much detail as possible. --> # Environment ```none Windows build number: Microsoft Windows [Version 10.0.18362.175] Any other software? Visual Studio Code Insiders ``` # Steps to reproduce <!-- A description of how to trigger this bug. --> 1. Launch VS Code Insiders. 2. Ensure using conpty with setting: `terminal.integrated.windowsEnableConpty` 3. Open a terminal with Ctrl+` 4. Close the terminal with the trash can icon 5. Repeat 3 and 4 until the window hangs. It takes around 10 retries on my machine. # Expected behavior No hanging. # Actual behavior The window hangs. Using windbg, I see the following: ![conpty_stack](https://user-images.githubusercontent.com/6561887/60627570-4dae2800-9da3-11e9-9ff4-9aefa9ed5b01.png) ![conpty_locals](https://user-images.githubusercontent.com/6561887/60627571-4dae2800-9da3-11e9-9768-0adfc21f1453.png) ![conpty_leftovers](https://user-images.githubusercontent.com/6561887/60627572-4dae2800-9da3-11e9-91fd-4279f1d9137d.png) Notice the many leftover conhosts in the task manager as well as the hang in ClosePseudoConsole. Here is the link to the code in node-pty calling into the PseudoConsole API. https://github.com/microsoft/node-pty/blob/04445ed76f90b4f56a190982ea2d4fcdd22a0ee7/src/win/conpty.cc#L429 /cc @daimms
Author
Owner

@DHowett-MSFT commented on GitHub (Jul 8, 2019):

This just needs somebody to sit down to debug. Thanks for the report.

@DHowett-MSFT commented on GitHub (Jul 8, 2019): This just needs somebody to sit down to debug. Thanks for the report.
Author
Owner

@miniksa commented on GitHub (Aug 7, 2019):

This is because the ConPTY is being started with CreatePseudoConsole() with the flag PSEUDOCONSOLE_INHERIT_CURSOR. There's an indefinite wait for the calling terminal to respond with the cursor position before the startup lock is freed for other threads to use. A different thread gets the shutdown message and is attempting to acquire the same lock to clean things up before exiting.

It looks like if the PTY starts up fast enough so that the shutdown message comes after the cursor is inherited OR if the shutdown happens before the PTY starts asking for the cursor inherit, then everything is good.

Given this is a race condition, I can't really blame it on the caller holding it wrong. I'll work to resolve it such that a shutdown is a valid way of halting the request for the cursor position while it is starting up.

@miniksa commented on GitHub (Aug 7, 2019): This is because the ConPTY is being started with `CreatePseudoConsole()` with the flag `PSEUDOCONSOLE_INHERIT_CURSOR`. There's an indefinite wait for the calling terminal to respond with the cursor position before the startup lock is freed for other threads to use. A different thread gets the shutdown message and is attempting to acquire the same lock to clean things up before exiting. It looks like if the PTY starts up fast enough so that the shutdown message comes after the cursor is inherited OR if the shutdown happens before the PTY starts asking for the cursor inherit, then everything is good. Given this is a race condition, I can't really blame it on the caller holding it wrong. I'll work to resolve it such that a shutdown is a valid way of halting the request for the cursor position while it is starting up.
Author
Owner

@miniksa commented on GitHub (Aug 7, 2019):

Oh, yeah, conhost is open source now.

This is the VtIo startup code that is waiting to hear the cursor position from the calling terminal:
8fa42e09df/src/host/VtIo.cpp (L238-L255)

And this is the Signal thread that has noticed the shutdown and is attempting to acquire the lock so it can finish closing client process state:
8fa42e09df/src/host/output.cpp (L503-L527)

Here is the stack of the conhost threads when it is stuck:

   0  Id: f84.9168 Suspend: 1 Teb: 0000002d`0bbc3000 Unfrozen
 # Child-SP          RetAddr           Call Site
00 0000002d`0bcfebf8 00007ffb`69b38b6b ntdll!ZwReadFile+0x14 [minkernel\ntdll\daytona\objfre\amd64\usrstubs.asm @ 227] 
01 0000002d`0bcfec00 00007ff6`650e4d1e KERNELBASE!ReadFile+0x7b [minkernel\kernelbase\filehops.c @ 1397] 
02 0000002d`0bcfec70 00007ff6`650e479c conhost!Microsoft::Console::VtInputThread::DoReadInput+0x4a [onecore\windows\core\console\open\src\host\vtinputthread.cpp @ 111] 
03 0000002d`0bcfedd0 00007ff6`650d0b22 conhost!Microsoft::Console::VirtualTerminal::VtIo::StartIfNeeded+0xc0 [onecore\windows\core\console\open\src\host\vtio.cpp @ 251] 
04 0000002d`0bcfee00 00007ff6`650af84f conhost!ConsoleAllocateConsole+0x18d32 [onecore\windows\core\console\open\src\host\srvinit.cpp @ 609] 
05 0000002d`0bcfeeb0 00007ff6`650aaaec conhost!IoDispatchers::ConsoleHandleConnectionRequest+0x2cb [onecore\windows\core\console\open\src\server\iodispatchers.cpp @ 183] 
06 (Inline Function) --------`-------- conhost!IoSorter::ServiceIoOperation+0x15f [onecore\windows\core\console\open\src\server\iosorter.cpp @ 37] 
07 0000002d`0bcff9b0 00007ffb`6b257034 conhost!ConsoleIoThread+0x23c [onecore\windows\core\console\open\src\host\srvinit.cpp @ 667] 
08 0000002d`0bcffb30 00007ffb`6c27b0a1 KERNEL32!BaseThreadInitThunk+0x14 [clientcore\base\win32\client\thread.c @ 64] 
09 0000002d`0bcffb60 00000000`00000000 ntdll!RtlUserThreadStart+0x21 [minkernel\ntdll\rtlstrt.c @ 1153] 

   1  Id: f84.90cc Suspend: 1 Teb: 0000002d`0bbc5000 Unfrozen
 # Child-SP          RetAddr           Call Site
00 0000002d`0bd7f508 00007ffb`6c28fb61 ntdll!ZwWaitForAlertByThreadId+0x14 [minkernel\ntdll\daytona\objfre\amd64\usrstubs.asm @ 3875] 
01 0000002d`0bd7f510 00007ffb`6c28fa16 ntdll!RtlpWaitOnAddressWithTimeout+0x81 [minkernel\ntos\rtl\waitaddr.c @ 851] 
02 0000002d`0bd7f540 00007ffb`6c28f83d ntdll!RtlpWaitOnAddress+0xae [minkernel\ntos\rtl\waitaddr.c @ 1094] 
03 0000002d`0bd7f5b0 00007ffb`6c239e24 ntdll!RtlpWaitOnCriticalSection+0xfd [minkernel\ntos\rtl\resource.c @ 1610] 
04 0000002d`0bd7f690 00007ffb`6c239c52 ntdll!RtlpEnterCriticalSectionContended+0x1c4 [minkernel\ntos\rtl\resource.c @ 2317] 
05 0000002d`0bd7f6f0 00007ff6`650ed4c1 ntdll!RtlEnterCriticalSection+0x42 [minkernel\ntos\rtl\resource.c @ 1923] 
06 (Inline Function) --------`-------- conhost!CONSOLE_INFORMATION::LockConsole+0xe [onecore\windows\core\console\open\src\host\consoleinformation.cpp @ 63] 
07 (Inline Function) --------`-------- conhost!LockConsole+0xe [onecore\windows\core\console\open\src\host\handle.cpp @ 16] 
08 0000002d`0bd7f720 00007ff6`650e536a conhost!CloseConsoleProcessState+0x2d [onecore\windows\core\console\open\src\host\output.cpp @ 525] 
09 (Inline Function) --------`-------- conhost!Microsoft::Console::PtySignalInputThread::_Shutdown+0x5 [onecore\windows\core\console\open\src\host\ptysignalinputthread.cpp @ 203] 
0a 0000002d`0bd7f750 00007ff6`650e53b1 conhost!Microsoft::Console::PtySignalInputThread::_GetData+0x56 [onecore\windows\core\console\open\src\host\ptysignalinputthread.cpp @ 158] 
0b 0000002d`0bd7f790 00007ffb`6b257034 conhost!Microsoft::Console::PtySignalInputThread::_InputThread+0x25 [onecore\windows\core\console\open\src\host\ptysignalinputthread.cpp @ 83] 
0c 0000002d`0bd7f7d0 00007ffb`6c27b0a1 KERNEL32!BaseThreadInitThunk+0x14 [clientcore\base\win32\client\thread.c @ 64] 
0d 0000002d`0bd7f800 00000000`00000000 ntdll!RtlUserThreadStart+0x21 [minkernel\ntdll\rtlstrt.c @ 1153] 

The resolution is likely to make the ReadFile inside VtInputThread.cpp and any other synchronous ReadFile and WriteFile operations into OVERLAPPED ones that are asynchronous immediate calls that then WaitForMultipleObjects on the event in the overlapped structure AND on an event that can be signaled by the signal thread that says "Hey, give that up, we're going down!"

8fa42e09df/src/host/VtInputThread.cpp (L111)

@miniksa commented on GitHub (Aug 7, 2019): Oh, yeah, conhost is open source now. This is the VtIo startup code that is waiting to hear the cursor position from the calling terminal: https://github.com/microsoft/terminal/blob/8fa42e09dfc6cd57d29e517a002d8c7a99e2aebd/src/host/VtIo.cpp#L238-L255 And this is the Signal thread that has noticed the shutdown and is attempting to acquire the lock so it can finish closing client process state: https://github.com/microsoft/terminal/blob/8fa42e09dfc6cd57d29e517a002d8c7a99e2aebd/src/host/output.cpp#L503-L527 Here is the stack of the conhost threads when it is stuck: ``` 0 Id: f84.9168 Suspend: 1 Teb: 0000002d`0bbc3000 Unfrozen # Child-SP RetAddr Call Site 00 0000002d`0bcfebf8 00007ffb`69b38b6b ntdll!ZwReadFile+0x14 [minkernel\ntdll\daytona\objfre\amd64\usrstubs.asm @ 227] 01 0000002d`0bcfec00 00007ff6`650e4d1e KERNELBASE!ReadFile+0x7b [minkernel\kernelbase\filehops.c @ 1397] 02 0000002d`0bcfec70 00007ff6`650e479c conhost!Microsoft::Console::VtInputThread::DoReadInput+0x4a [onecore\windows\core\console\open\src\host\vtinputthread.cpp @ 111] 03 0000002d`0bcfedd0 00007ff6`650d0b22 conhost!Microsoft::Console::VirtualTerminal::VtIo::StartIfNeeded+0xc0 [onecore\windows\core\console\open\src\host\vtio.cpp @ 251] 04 0000002d`0bcfee00 00007ff6`650af84f conhost!ConsoleAllocateConsole+0x18d32 [onecore\windows\core\console\open\src\host\srvinit.cpp @ 609] 05 0000002d`0bcfeeb0 00007ff6`650aaaec conhost!IoDispatchers::ConsoleHandleConnectionRequest+0x2cb [onecore\windows\core\console\open\src\server\iodispatchers.cpp @ 183] 06 (Inline Function) --------`-------- conhost!IoSorter::ServiceIoOperation+0x15f [onecore\windows\core\console\open\src\server\iosorter.cpp @ 37] 07 0000002d`0bcff9b0 00007ffb`6b257034 conhost!ConsoleIoThread+0x23c [onecore\windows\core\console\open\src\host\srvinit.cpp @ 667] 08 0000002d`0bcffb30 00007ffb`6c27b0a1 KERNEL32!BaseThreadInitThunk+0x14 [clientcore\base\win32\client\thread.c @ 64] 09 0000002d`0bcffb60 00000000`00000000 ntdll!RtlUserThreadStart+0x21 [minkernel\ntdll\rtlstrt.c @ 1153] 1 Id: f84.90cc Suspend: 1 Teb: 0000002d`0bbc5000 Unfrozen # Child-SP RetAddr Call Site 00 0000002d`0bd7f508 00007ffb`6c28fb61 ntdll!ZwWaitForAlertByThreadId+0x14 [minkernel\ntdll\daytona\objfre\amd64\usrstubs.asm @ 3875] 01 0000002d`0bd7f510 00007ffb`6c28fa16 ntdll!RtlpWaitOnAddressWithTimeout+0x81 [minkernel\ntos\rtl\waitaddr.c @ 851] 02 0000002d`0bd7f540 00007ffb`6c28f83d ntdll!RtlpWaitOnAddress+0xae [minkernel\ntos\rtl\waitaddr.c @ 1094] 03 0000002d`0bd7f5b0 00007ffb`6c239e24 ntdll!RtlpWaitOnCriticalSection+0xfd [minkernel\ntos\rtl\resource.c @ 1610] 04 0000002d`0bd7f690 00007ffb`6c239c52 ntdll!RtlpEnterCriticalSectionContended+0x1c4 [minkernel\ntos\rtl\resource.c @ 2317] 05 0000002d`0bd7f6f0 00007ff6`650ed4c1 ntdll!RtlEnterCriticalSection+0x42 [minkernel\ntos\rtl\resource.c @ 1923] 06 (Inline Function) --------`-------- conhost!CONSOLE_INFORMATION::LockConsole+0xe [onecore\windows\core\console\open\src\host\consoleinformation.cpp @ 63] 07 (Inline Function) --------`-------- conhost!LockConsole+0xe [onecore\windows\core\console\open\src\host\handle.cpp @ 16] 08 0000002d`0bd7f720 00007ff6`650e536a conhost!CloseConsoleProcessState+0x2d [onecore\windows\core\console\open\src\host\output.cpp @ 525] 09 (Inline Function) --------`-------- conhost!Microsoft::Console::PtySignalInputThread::_Shutdown+0x5 [onecore\windows\core\console\open\src\host\ptysignalinputthread.cpp @ 203] 0a 0000002d`0bd7f750 00007ff6`650e53b1 conhost!Microsoft::Console::PtySignalInputThread::_GetData+0x56 [onecore\windows\core\console\open\src\host\ptysignalinputthread.cpp @ 158] 0b 0000002d`0bd7f790 00007ffb`6b257034 conhost!Microsoft::Console::PtySignalInputThread::_InputThread+0x25 [onecore\windows\core\console\open\src\host\ptysignalinputthread.cpp @ 83] 0c 0000002d`0bd7f7d0 00007ffb`6c27b0a1 KERNEL32!BaseThreadInitThunk+0x14 [clientcore\base\win32\client\thread.c @ 64] 0d 0000002d`0bd7f800 00000000`00000000 ntdll!RtlUserThreadStart+0x21 [minkernel\ntdll\rtlstrt.c @ 1153] ``` The resolution is likely to make the `ReadFile` inside `VtInputThread.cpp` and any other synchronous `ReadFile` and `WriteFile` operations into `OVERLAPPED` ones that are asynchronous immediate calls that then `WaitForMultipleObjects` on the event in the overlapped structure AND on an event that can be signaled by the signal thread that says "Hey, give that up, we're going down!" https://github.com/microsoft/terminal/blob/8fa42e09dfc6cd57d29e517a002d8c7a99e2aebd/src/host/VtInputThread.cpp#L111
Author
Owner

@zadjii-msft commented on GitHub (Aug 8, 2019):

@Tyriar You might be able to work around this by not passing PSEUDOCONSOLE_INHERIT_CURSOR to CreatePseudoConsole. I don't think you'll need to be doing anything to inherit the cursor position in your scenario, right? The conpty is always being started in a new terminal, right?

Disabling PSEUDOCONSOLE_INHERIT_CURSOR will make conpty assume that the cursor position is starting at 0,0 in an empty buffer, but for VsCode that seems like a reasonable assumption.

(We should definitely still fix the thing that @miniksa found)

@zadjii-msft commented on GitHub (Aug 8, 2019): @Tyriar You might be able to work around this by not passing `PSEUDOCONSOLE_INHERIT_CURSOR` to `CreatePseudoConsole`. I don't think you'll need to be doing anything to inherit the cursor position in your scenario, right? The conpty is always being started in a _new_ terminal, right? Disabling `PSEUDOCONSOLE_INHERIT_CURSOR` will make conpty assume that the cursor position is starting at 0,0 in an empty buffer, but for VsCode that seems like a reasonable assumption. (We should definitely still fix the thing that @miniksa found)
Author
Owner

@Tyriar commented on GitHub (Aug 8, 2019):

We added that because for tasks we write to the terminal before the process is launched, we could probably avoid almost all crashes by just using it when a terminal has already been written too which happens pretty infrequently. I'll start the conversation for the workaround in the vscode issue, thanks.

@Tyriar commented on GitHub (Aug 8, 2019): We added that because for tasks we write to the terminal before the process is launched, we could probably avoid almost all crashes by just using it when a terminal has already been written too which happens pretty infrequently. I'll start the conversation for the workaround in the vscode issue, thanks.
Author
Owner

@Tyriar commented on GitHub (Aug 16, 2019):

The workaround might not be fixing the issue https://github.com/microsoft/vscode/issues/76548#issuecomment-522096999

@Tyriar commented on GitHub (Aug 16, 2019): The workaround might not be fixing the issue https://github.com/microsoft/vscode/issues/76548#issuecomment-522096999
Author
Owner

@miniksa commented on GitHub (Aug 17, 2019):

@Tyriar, did this behavior change between Code Insiders July 2019 and Code Insiders August 2019?

I was attempting to fix this issue and then VS Code updated itself and now I can no longer repro it the same way.

@miniksa commented on GitHub (Aug 17, 2019): @Tyriar, did this behavior change between Code Insiders July 2019 and Code Insiders August 2019? I was attempting to fix this issue and then VS Code updated itself and now I can no longer repro it the same way.
Author
Owner

@miniksa commented on GitHub (Aug 17, 2019):

@Tyriar, did this behavior change between Code Insiders July 2019 and Code Insiders August 2019?

I was attempting to fix this issue and then VS Code updated itself and now I can no longer repro it the same way.

Ugh, it looks like it did. Now to find where the heck I can get the July 2019 one again and have it not auto update.

@miniksa commented on GitHub (Aug 17, 2019): > @Tyriar, did this behavior change between Code Insiders July 2019 and Code Insiders August 2019? > > I was attempting to fix this issue and then VS Code updated itself and now I can no longer repro it the same way. Ugh, it looks like it did. Now to find where the heck I can get the July 2019 one again and have it not auto update.
Author
Owner

@Tyriar commented on GitHub (Aug 17, 2019):

@miniksa installer/zip at the top: https://code.visualstudio.com/updates/v1_36, you can set "update.mode": "none" to disable automatic updates

@Tyriar commented on GitHub (Aug 17, 2019): @miniksa installer/zip at the top: https://code.visualstudio.com/updates/v1_36, you can set `"update.mode": "none"` to disable automatic updates
Author
Owner

@miniksa commented on GitHub (Aug 17, 2019):

Beautiful, thank you! @tyriar

@miniksa commented on GitHub (Aug 17, 2019): Beautiful, thank you! @tyriar
Author
Owner

@Tyriar commented on GitHub (Aug 20, 2019):

It's sounding like it's still happening when we use that flag (since we use it on tasks), just a lot more often than I would have expected based on my experience reproducing https://github.com/microsoft/vscode/issues/71966#issuecomment-522848243

We can't really disable it for tasks as then the updates don't get printed to the terminal, that was another bug: https://github.com/microsoft/terminal/issues/919

@Tyriar commented on GitHub (Aug 20, 2019): It's sounding like it's still happening when we use that flag (since we use it on tasks), just a lot more often than I would have expected based on my experience reproducing https://github.com/microsoft/vscode/issues/71966#issuecomment-522848243 We can't really disable it for tasks as then the updates don't get printed to the terminal, that was another bug: https://github.com/microsoft/terminal/issues/919
Author
Owner

@miniksa commented on GitHub (Aug 20, 2019):

@Tyriar, OK. I'm working on a fix for the one specific stack/repro I had right now above.

I have a solution for that particular race, but it revealed a deadlock behind that one. It looks like we're attempting to further paint one last frame out the output channel before everything is torn down. My suspicion here is that the write of the last frame is stuck because the read operation is happening on the same thread on your side of the fence as the call to close the pseudo console. So there's a deadlock as the output channel isn't being drained so the paint frame doesn't think it is done and is holding the session open until it is drained.

Again, not something that is specifically your fault. We should be more robust than that.

A workaround for this case might be to close the handles you're not intending to listen to anymore before calling ClosePseudoConsole. Then we can't get stuck on sending you information during teardown because the pipes will be broken. There's plenty of code handling teardown when the PTY pipes are broken.

It's taking me a while to come up with a complete solution here because it seems like there's a wide spread problem in our code with reliability around synchronization and threading for the PTY. Thanks for your patience.

@miniksa commented on GitHub (Aug 20, 2019): @Tyriar, OK. I'm working on a fix for the one specific stack/repro I had right now above. I have a solution for that particular race, but it revealed a deadlock behind that one. It looks like we're attempting to further paint one last frame out the output channel before everything is torn down. My suspicion here is that the write of the last frame is stuck because the read operation is happening on the same thread on your side of the fence as the call to close the pseudo console. So there's a deadlock as the output channel isn't being drained so the paint frame doesn't think it is done and is holding the session open until it is drained. Again, not something that is specifically your fault. We should be more robust than that. A workaround for this case might be to close the handles you're not intending to listen to anymore before calling ClosePseudoConsole. Then we can't get stuck on sending you information during teardown because the pipes will be broken. There's plenty of code handling teardown when the PTY pipes are broken. It's taking me a while to come up with a complete solution here because it seems like there's a wide spread problem in our code with reliability around synchronization and threading for the PTY. Thanks for your patience.
Author
Owner

@Tyriar commented on GitHub (Aug 20, 2019):

@miniksa so 3e645898f8/src/win/conpty.cc (L438) should move above the ClosePseudoConsole call?

@Tyriar commented on GitHub (Aug 20, 2019): @miniksa so https://github.com/microsoft/node-pty/blob/3e645898f83370db2894cdde09ac180235e1cfb7/src/win/conpty.cc#L438 should move above the ClosePseudoConsole call?
Author
Owner

@miniksa commented on GitHub (Aug 20, 2019):

@miniksa so 3e645898f8/src/win/conpty.cc (L438) should move above the ClosePseudoConsole call?

No, I mean the handles hIn and hOut inside your pty_baton that you got from opening up the session. hShell looks like a process handle, not pipe handles.

@miniksa commented on GitHub (Aug 20, 2019): > @miniksa so https://github.com/microsoft/node-pty/blob/3e645898f83370db2894cdde09ac180235e1cfb7/src/win/conpty.cc#L438 should move above the ClosePseudoConsole call? No, I mean the handles `hIn` and `hOut` inside your `pty_baton` that you got from opening up the session. `hShell` looks like a process handle, not pipe handles.
Author
Owner

@miniksa commented on GitHub (Aug 20, 2019):

@zadjii-msft, to totally fix this I need to re-evaluate ff87190823/src/interactivity/base/ServiceLocator.cpp (L39-L46).

I don't have full context on the issue you fixed in the past, so I don't want to break that while I'm fixing this.

Right now I'm deadlocked on the Pty Signal Input thread getting a shutdown message and the TriggerTeardown sending a final paint to the Output channel. But the same process that sent the signal isn't consuming the Output channel so it deadlocks here.

If I in any way attempt to make this final paint asynchronous, there's then no guarantee that it will actually write before our process is torn down because we're calling RundownAndExit here.

I'm trying to determine if there's a difference in context between this type of shutdown and the one that you fixed in MSFT: 15506250.

If not, then we probably need to make it so the Pty Signal receiving a shutdown doesn't force the process closed through RundownAndExit but rather politely notifies everyone else that it's time to go and just lets whatever happen happen (and fix up any further issues that leave behind conhosts unexpectedly from there).

Edit:

Locked stack:

 	conhost.exe!Microsoft::Console::Render::VtEngine::_Flush() Line 109	C++
 	conhost.exe!Microsoft::Console::Render::VtEngine::EndPaint() Line 80	C++
 	conhost.exe!Microsoft::Console::Render::XtermEngine::EndPaint() Line 118	C++
 	conhost.exe!Microsoft::Console::Render::Renderer::_PaintFrameForEngine::__l2::<lambda>() Line 96	C++
 	conhost.exe!wil::details::lambda_call<void <lambda>(void) >::reset() Line 459	C++
 	conhost.exe!Microsoft::Console::Render::Renderer::_PaintFrameForEngine(Microsoft::Console::Render::IRenderEngine * const pEngine) Line 127	C++
 	conhost.exe!Microsoft::Console::Render::Renderer::TriggerTeardown() Line 260	C++
>	conhost.exe!Microsoft::Console::Interactivity::ServiceLocator::RundownAndExit(const HRESULT hr) Line 65	C++
 	conhost.exe!Microsoft::Console::PtySignalInputThread::_Shutdown() Line 215	C++
 	conhost.exe!Microsoft::Console::PtySignalInputThread::_GetData(void * const pBuffer, const unsigned long cbBuffer) Line 149	C++
 	conhost.exe!Microsoft::Console::PtySignalInputThread::_InputThread() Line 83	C++
 	conhost.exe!Microsoft::Console::PtySignalInputThread::StaticThreadProc(void * lpParameter) Line 59	C++

Oh, also a disconnect request is coming in here at the same time from the client:

 	conhost.exe!CONSOLE_INFORMATION::LockConsole() Line 64	C++
 	conhost.exe!LockConsole() Line 17	C++
>	conhost.exe!RemoveConsole(ConsoleProcessHandle * ProcessData) Line 197	C++
 	conhost.exe!IoDispatchers::ConsoleClientDisconnectRoutine(_CONSOLE_API_MSG * pMessage) Line 275	C++
 	conhost.exe!IoSorter::ServiceIoOperation(_CONSOLE_API_MSG * const pMsg, _CONSOLE_API_MSG * * ReplyMsg) Line 41	C++
 	conhost.exe!ConsoleIoThread(void * __formal) Line 668	C++
@miniksa commented on GitHub (Aug 20, 2019): @zadjii-msft, to totally fix this I need to re-evaluate https://github.com/microsoft/terminal/blob/ff87190823ea699a7eee4dafda34358ee0caffa4/src/interactivity/base/ServiceLocator.cpp#L39-L46. I don't have full context on the issue you fixed in the past, so I don't want to break that while I'm fixing this. Right now I'm deadlocked on the Pty Signal Input thread getting a shutdown message and the `TriggerTeardown` sending a final paint to the Output channel. But the same process that sent the signal isn't consuming the Output channel so it deadlocks here. If I in any way attempt to make this final paint asynchronous, there's then no guarantee that it will actually write before our process is torn down because we're calling RundownAndExit here. I'm trying to determine if there's a difference in context between this type of shutdown and the one that you fixed in MSFT: 15506250. If not, then we probably need to make it so the Pty Signal receiving a shutdown doesn't force the process closed through RundownAndExit but rather politely notifies everyone else that it's time to go and just lets whatever happen happen (and fix up any further issues that leave behind conhosts unexpectedly from there). Edit: Locked stack: ``` conhost.exe!Microsoft::Console::Render::VtEngine::_Flush() Line 109 C++ conhost.exe!Microsoft::Console::Render::VtEngine::EndPaint() Line 80 C++ conhost.exe!Microsoft::Console::Render::XtermEngine::EndPaint() Line 118 C++ conhost.exe!Microsoft::Console::Render::Renderer::_PaintFrameForEngine::__l2::<lambda>() Line 96 C++ conhost.exe!wil::details::lambda_call<void <lambda>(void) >::reset() Line 459 C++ conhost.exe!Microsoft::Console::Render::Renderer::_PaintFrameForEngine(Microsoft::Console::Render::IRenderEngine * const pEngine) Line 127 C++ conhost.exe!Microsoft::Console::Render::Renderer::TriggerTeardown() Line 260 C++ > conhost.exe!Microsoft::Console::Interactivity::ServiceLocator::RundownAndExit(const HRESULT hr) Line 65 C++ conhost.exe!Microsoft::Console::PtySignalInputThread::_Shutdown() Line 215 C++ conhost.exe!Microsoft::Console::PtySignalInputThread::_GetData(void * const pBuffer, const unsigned long cbBuffer) Line 149 C++ conhost.exe!Microsoft::Console::PtySignalInputThread::_InputThread() Line 83 C++ conhost.exe!Microsoft::Console::PtySignalInputThread::StaticThreadProc(void * lpParameter) Line 59 C++ ``` Oh, also a disconnect request is coming in here at the same time from the client: ``` conhost.exe!CONSOLE_INFORMATION::LockConsole() Line 64 C++ conhost.exe!LockConsole() Line 17 C++ > conhost.exe!RemoveConsole(ConsoleProcessHandle * ProcessData) Line 197 C++ conhost.exe!IoDispatchers::ConsoleClientDisconnectRoutine(_CONSOLE_API_MSG * pMessage) Line 275 C++ conhost.exe!IoSorter::ServiceIoOperation(_CONSOLE_API_MSG * const pMsg, _CONSOLE_API_MSG * * ReplyMsg) Line 41 C++ conhost.exe!ConsoleIoThread(void * __formal) Line 668 C++ ```
Author
Owner

@zadjii-msft commented on GitHub (Aug 20, 2019):

So, the situation MSFT:15506250 is referring to is back in the earliest WSL tests of the conpty API. When you're using conpty to host a commandline like cmd /c dir, the client app could run and exit before we've rendered all their output. In this scenario, the last client exits, and conhost will tear itself down because there are no clients left. The "terminal" (in this scenario, the linux subsystem) isn't breaking the signal pipe to close the conpty. I believe WSL will keep reading the output pipe until the pipe is broken, and use that as a signal that the child process died.

@benhillis can keep me honest if I'm mis-remembering this.

This feels to me specifically different than the ClosePseudoConsole route, and I think you'd be fine changing it as you suggest. If someone called ClosePseudoConsole, then we could presume that they don't care anymore about something we have buffered, while in the scenario where the client exits (triggering our exit), we'd still want to deliver all the buffered output.

@zadjii-msft commented on GitHub (Aug 20, 2019): So, the situation MSFT:15506250 is referring to is back in the earliest WSL tests of the conpty API. When you're using conpty to host a commandline like `cmd /c dir`, the client app could run and exit before we've rendered all their output. In this scenario, the last client exits, and conhost will tear itself down because there are no clients left. The "terminal" (in this scenario, the linux subsystem) isn't breaking the signal pipe to close the conpty. I believe WSL will keep reading the output pipe until the pipe is broken, and use that as a signal that the child process died. @benhillis can keep me honest if I'm mis-remembering this. This feels to me specifically different than the ClosePseudoConsole route, and I think you'd be fine changing it as you suggest. If someone called ClosePseudoConsole, then we could presume that they don't care anymore about something we have buffered, while in the scenario where the client exits (triggering our exit), we'd still want to deliver all the buffered output.
Author
Owner

@Tyriar commented on GitHub (Aug 21, 2019):

I just hit the hang when closing a regular terminal (not a task) which has PSEUDOCONSOLE_INHERIT_CURSOR set to 0. Killing conhost freed up the UI again.

@Tyriar commented on GitHub (Aug 21, 2019): I just hit the hang when closing a regular terminal (not a task) which has `PSEUDOCONSOLE_INHERIT_CURSOR` set to 0. Killing conhost freed up the UI again.
Author
Owner

@miniksa commented on GitHub (Aug 22, 2019):

I just hit the hang when closing a regular terminal (not a task) which has PSEUDOCONSOLE_INHERIT_CURSOR set to 0. Killing conhost freed up the UI again.

Next time you hit the hang or any other hang like this, can you please take a process dump of the supposedly hung conhost so I can check the stacks?

If you're hitting a similar situation without the flag set, then you're getting hung in a different situation which I'm not necessarily working on fixing right now in this issue.

@miniksa commented on GitHub (Aug 22, 2019): > I just hit the hang when closing a regular terminal (not a task) which has `PSEUDOCONSOLE_INHERIT_CURSOR` set to 0. Killing conhost freed up the UI again. Next time you hit the hang or any other hang like this, can you please take a process dump of the supposedly hung conhost so I can check the stacks? If you're hitting a similar situation without the flag set, then you're getting hung in a different situation which I'm not necessarily working on fixing right now in this issue.
Author
Owner

@miniksa commented on GitHub (Aug 23, 2019):

Here is a second interlock that can occur later in the shutdown process:

.  0  Id: 29ec.45ac Suspend: 0 Teb: 000000c2`af097000 Unfrozen
 # Child-SP          RetAddr           Call Site
00 000000c2`af2ff9c8 00007ffd`46527619 ntdll!ZwWaitForAlertByThreadId+0x14 [minkernel\ntdll\daytona\objfre\amd64\usrstubs.asm @ 3843] 
01 000000c2`af2ff9d0 00007ffd`465274d2 ntdll!RtlpWaitOnAddressWithTimeout+0x81 [minkernel\ntos\rtl\waitaddr.c @ 851] 
02 000000c2`af2ffa00 00007ffd`465272fd ntdll!RtlpWaitOnAddress+0xae [minkernel\ntos\rtl\waitaddr.c @ 1094] 
03 000000c2`af2ffa70 00007ffd`4653b586 ntdll!RtlpWaitOnCriticalSection+0xfd [minkernel\ntos\rtl\resource.c @ 1626] 
04 000000c2`af2ffb50 00007ffd`4653b3d0 ntdll!RtlpEnterCriticalSectionContended+0x1a6 [minkernel\ntos\rtl\resource.c @ 2333] 
05 000000c2`af2ffbb0 00007ff6`1c528d3e ntdll!RtlEnterCriticalSection+0x40 [minkernel\ntos\rtl\resource.c @ 1939] 
06 (Inline Function) --------`-------- conhost!CONSOLE_INFORMATION::LockConsole+0xd [onecore\windows\core\console\open\src\host\consoleinformation.cpp @ 62] 
07 (Inline Function) --------`-------- conhost!LockConsole+0xd [onecore\windows\core\console\open\src\host\handle.cpp @ 17] 
08 (Inline Function) --------`-------- conhost!RemoveConsole+0xd [onecore\windows\core\console\open\src\host\srvinit.cpp @ 198] 
09 000000c2`af2ffbe0 00007ff6`1c52f2d4 conhost!IoDispatchers::ConsoleClientDisconnectRoutine+0x46 [onecore\windows\core\console\open\src\server\iodispatchers.cpp @ 280] 
0a (Inline Function) --------`-------- conhost!IoSorter::ServiceIoOperation+0x14d [onecore\windows\core\console\open\src\server\iosorter.cpp @ 44] 
0b 000000c2`af2ffc10 00007ffd`46347bd4 conhost!ConsoleIoThread+0x224 [onecore\windows\core\console\open\src\host\srvinit.cpp @ 670] 
0c 000000c2`af2ffd90 00007ffd`4658ce71 kernel32!BaseThreadInitThunk+0x14 [base\win32\client\thread.c @ 64] 
0d 000000c2`af2ffdc0 00000000`00000000 ntdll!RtlUserThreadStart+0x21 [minkernel\ntdll\rtlstrt.c @ 1153] 

   1  Id: 29ec.6954 Suspend: 0 Teb: 000000c2`af099000 Unfrozen
 # Child-SP          RetAddr           Call Site
00 000000c2`af37f8e8 00007ffd`43ea50aa ntdll!ZwWriteFile+0x14 [minkernel\ntdll\daytona\objfre\amd64\usrstubs.asm @ 243] 
01 000000c2`af37f8f0 00007ff6`1c59138c KERNELBASE!WriteFile+0x7a [minkernel\kernelbase\filehops.c @ 1558] 
02 000000c2`af37f960 00007ff6`1c594abc conhost!Microsoft::Console::Render::VtEngine::_Flush+0x4c [onecore\windows\core\console\open\src\renderer\vt\state.cpp @ 112] 
03 000000c2`af37f9b0 00007ff6`1c591c14 conhost!Microsoft::Console::Render::VtEngine::EndPaint+0xbc [onecore\windows\core\console\open\src\renderer\vt\paint.cpp @ 78] 
04 000000c2`af37fa20 00007ff6`1c530732 conhost!Microsoft::Console::Render::XtermEngine::EndPaint+0x94 [onecore\windows\core\console\open\src\renderer\vt\xtermengine.cpp @ 103] 
05 (Inline Function) --------`-------- conhost!Microsoft::Console::Render::Renderer::_PaintFrameForEngine::__l2::<lambda_52050c757ade60eff63cb1da4741b009>::operator()+0x11 [onecore\windows\core\console\open\src\renderer\base\renderer.cpp @ 158] 
06 (Inline Function) --------`-------- conhost!wil::details::lambda_call<<lambda_52050c757ade60eff63cb1da4741b009> >::reset+0x18 [internal\sdk\inc\wil\opensource\wil\resource.h @ 431] 
07 000000c2`af37fa80 00007ff6`1c554fcc conhost!Microsoft::Console::Render::Renderer::_PaintFrameForEngine+0x402 [onecore\windows\core\console\open\src\renderer\base\renderer.cpp @ 186] 
08 000000c2`af37fb90 00007ff6`1c53dcb9 conhost!Microsoft::Console::Render::Renderer::TriggerTeardown+0x17d7c [onecore\windows\core\console\open\src\renderer\base\renderer.cpp @ 322] 
09 000000c2`af37fbc0 00007ff6`1c567499 conhost!Microsoft::Console::Interactivity::ServiceLocator::RundownAndExit+0x21 [onecore\windows\core\console\open\src\interactivity\base\servicelocator.cpp @ 61] 
0a (Inline Function) --------`-------- conhost!Microsoft::Console::PtySignalInputThread::_Shutdown+0x14 [onecore\windows\core\console\open\src\host\ptysignalinputthread.cpp @ 218] 
0b 000000c2`af37fbf0 00007ff6`1c5674da conhost!Microsoft::Console::PtySignalInputThread::_GetData+0x59 [onecore\windows\core\console\open\src\host\ptysignalinputthread.cpp @ 162] 
0c 000000c2`af37fc30 00007ffd`46347bd4 conhost!Microsoft::Console::PtySignalInputThread::_InputThread+0x2e [onecore\windows\core\console\open\src\host\ptysignalinputthread.cpp @ 86] 
0d 000000c2`af37fc80 00007ffd`4658ce71 kernel32!BaseThreadInitThunk+0x14 [base\win32\client\thread.c @ 64] 
0e 000000c2`af37fcb0 00000000`00000000 ntdll!RtlUserThreadStart+0x21 [minkernel\ntdll\rtlstrt.c @ 1153] 

I intend to correct this one too. What happens for this one is:

  1. The signal pipe is broken by the ClosePseudoConsole call
  2. It sends the process shutdown commands by dispatching a CTRL_CLOSE_EVENT to the client application
  3. The client application shuts down
  4. A console disconnect message comes in by the IO channel
  5. The PTY is configured to draw ONE LAST FRAME before leaving if the PTY output is still open so apps that open and close quickly (like running ipconfig from WSL interop) can get the entire frame before the PTY session disappears.

If the terminal application isn't listening in a multithreaded fashion to drain that output, it will hang indefinitely.

@miniksa commented on GitHub (Aug 23, 2019): Here is a second interlock that can occur later in the shutdown process: ``` . 0 Id: 29ec.45ac Suspend: 0 Teb: 000000c2`af097000 Unfrozen # Child-SP RetAddr Call Site 00 000000c2`af2ff9c8 00007ffd`46527619 ntdll!ZwWaitForAlertByThreadId+0x14 [minkernel\ntdll\daytona\objfre\amd64\usrstubs.asm @ 3843] 01 000000c2`af2ff9d0 00007ffd`465274d2 ntdll!RtlpWaitOnAddressWithTimeout+0x81 [minkernel\ntos\rtl\waitaddr.c @ 851] 02 000000c2`af2ffa00 00007ffd`465272fd ntdll!RtlpWaitOnAddress+0xae [minkernel\ntos\rtl\waitaddr.c @ 1094] 03 000000c2`af2ffa70 00007ffd`4653b586 ntdll!RtlpWaitOnCriticalSection+0xfd [minkernel\ntos\rtl\resource.c @ 1626] 04 000000c2`af2ffb50 00007ffd`4653b3d0 ntdll!RtlpEnterCriticalSectionContended+0x1a6 [minkernel\ntos\rtl\resource.c @ 2333] 05 000000c2`af2ffbb0 00007ff6`1c528d3e ntdll!RtlEnterCriticalSection+0x40 [minkernel\ntos\rtl\resource.c @ 1939] 06 (Inline Function) --------`-------- conhost!CONSOLE_INFORMATION::LockConsole+0xd [onecore\windows\core\console\open\src\host\consoleinformation.cpp @ 62] 07 (Inline Function) --------`-------- conhost!LockConsole+0xd [onecore\windows\core\console\open\src\host\handle.cpp @ 17] 08 (Inline Function) --------`-------- conhost!RemoveConsole+0xd [onecore\windows\core\console\open\src\host\srvinit.cpp @ 198] 09 000000c2`af2ffbe0 00007ff6`1c52f2d4 conhost!IoDispatchers::ConsoleClientDisconnectRoutine+0x46 [onecore\windows\core\console\open\src\server\iodispatchers.cpp @ 280] 0a (Inline Function) --------`-------- conhost!IoSorter::ServiceIoOperation+0x14d [onecore\windows\core\console\open\src\server\iosorter.cpp @ 44] 0b 000000c2`af2ffc10 00007ffd`46347bd4 conhost!ConsoleIoThread+0x224 [onecore\windows\core\console\open\src\host\srvinit.cpp @ 670] 0c 000000c2`af2ffd90 00007ffd`4658ce71 kernel32!BaseThreadInitThunk+0x14 [base\win32\client\thread.c @ 64] 0d 000000c2`af2ffdc0 00000000`00000000 ntdll!RtlUserThreadStart+0x21 [minkernel\ntdll\rtlstrt.c @ 1153] 1 Id: 29ec.6954 Suspend: 0 Teb: 000000c2`af099000 Unfrozen # Child-SP RetAddr Call Site 00 000000c2`af37f8e8 00007ffd`43ea50aa ntdll!ZwWriteFile+0x14 [minkernel\ntdll\daytona\objfre\amd64\usrstubs.asm @ 243] 01 000000c2`af37f8f0 00007ff6`1c59138c KERNELBASE!WriteFile+0x7a [minkernel\kernelbase\filehops.c @ 1558] 02 000000c2`af37f960 00007ff6`1c594abc conhost!Microsoft::Console::Render::VtEngine::_Flush+0x4c [onecore\windows\core\console\open\src\renderer\vt\state.cpp @ 112] 03 000000c2`af37f9b0 00007ff6`1c591c14 conhost!Microsoft::Console::Render::VtEngine::EndPaint+0xbc [onecore\windows\core\console\open\src\renderer\vt\paint.cpp @ 78] 04 000000c2`af37fa20 00007ff6`1c530732 conhost!Microsoft::Console::Render::XtermEngine::EndPaint+0x94 [onecore\windows\core\console\open\src\renderer\vt\xtermengine.cpp @ 103] 05 (Inline Function) --------`-------- conhost!Microsoft::Console::Render::Renderer::_PaintFrameForEngine::__l2::<lambda_52050c757ade60eff63cb1da4741b009>::operator()+0x11 [onecore\windows\core\console\open\src\renderer\base\renderer.cpp @ 158] 06 (Inline Function) --------`-------- conhost!wil::details::lambda_call<<lambda_52050c757ade60eff63cb1da4741b009> >::reset+0x18 [internal\sdk\inc\wil\opensource\wil\resource.h @ 431] 07 000000c2`af37fa80 00007ff6`1c554fcc conhost!Microsoft::Console::Render::Renderer::_PaintFrameForEngine+0x402 [onecore\windows\core\console\open\src\renderer\base\renderer.cpp @ 186] 08 000000c2`af37fb90 00007ff6`1c53dcb9 conhost!Microsoft::Console::Render::Renderer::TriggerTeardown+0x17d7c [onecore\windows\core\console\open\src\renderer\base\renderer.cpp @ 322] 09 000000c2`af37fbc0 00007ff6`1c567499 conhost!Microsoft::Console::Interactivity::ServiceLocator::RundownAndExit+0x21 [onecore\windows\core\console\open\src\interactivity\base\servicelocator.cpp @ 61] 0a (Inline Function) --------`-------- conhost!Microsoft::Console::PtySignalInputThread::_Shutdown+0x14 [onecore\windows\core\console\open\src\host\ptysignalinputthread.cpp @ 218] 0b 000000c2`af37fbf0 00007ff6`1c5674da conhost!Microsoft::Console::PtySignalInputThread::_GetData+0x59 [onecore\windows\core\console\open\src\host\ptysignalinputthread.cpp @ 162] 0c 000000c2`af37fc30 00007ffd`46347bd4 conhost!Microsoft::Console::PtySignalInputThread::_InputThread+0x2e [onecore\windows\core\console\open\src\host\ptysignalinputthread.cpp @ 86] 0d 000000c2`af37fc80 00007ffd`4658ce71 kernel32!BaseThreadInitThunk+0x14 [base\win32\client\thread.c @ 64] 0e 000000c2`af37fcb0 00000000`00000000 ntdll!RtlUserThreadStart+0x21 [minkernel\ntdll\rtlstrt.c @ 1153] ``` I intend to correct this one too. What happens for this one is: 1. The signal pipe is broken by the ClosePseudoConsole call 2. It sends the process shutdown commands by dispatching a CTRL_CLOSE_EVENT to the client application 3. The client application shuts down 4. A console disconnect message comes in by the IO channel 5. The PTY is configured to draw _ONE LAST FRAME_ before leaving if the PTY output is still open so apps that open and close quickly (like running ipconfig from WSL interop) can get the entire frame before the PTY session disappears. If the terminal application isn't listening in a multithreaded fashion to drain that output, it will hang indefinitely.
Author
Owner

@miniksa commented on GitHub (Aug 23, 2019):

On the plus side, I'm making progress.

On the minus side, for every race condition I resolve here, I find some more.

@miniksa commented on GitHub (Aug 23, 2019): On the plus side, I'm making progress. On the minus side, for every race condition I resolve here, I find some more.
Author
Owner

@miniksa commented on GitHub (Aug 26, 2019):

OK. I fixed as many races and locks as I could. @sbatten, @Tyriar

There will still be a situation where you close the pseudoconsole too fast that I cannot resolve.

Since you own control of starting the child process, you can technically start it up with an invalid handle or against a pseudoconsole that has already been closed. That is, you could start the pseudoconsole and close it while the client application is already started.

It will throw an error on its initialization when that happens and probably show a pop-up dialog with an error 0xc0000142 as "DLL initialization failed".

This would not be a hang. I believe I fixed all the hangs I could find.

To alleviate the other hangs from your side in the mean time, we highly recommend that each handle goes onto its own thread and is drained/filled to/from buffers on your side. We didn't originally envision this as an API consumed by a single blocked thread on the other side. I intend to update the documentation clarifying this point.

The two specific hangs from the dumps are resolvable from the client side by:

  • Draining the output pipe in another thread inside VSCode. If the buffer contains a cursor position query message, respond to it on the input pipe even if you're already shutting down the pseudoconsole on another thread.
  • Even if you don't use the inherit-cursor message, drain the output pipe on another thread anyway into a local buffer. As the pseudoconsole goes down, it tries to emit one final layout/paint frame into the pipe buffer. If that's not drained, it hangs until it is drained.
@miniksa commented on GitHub (Aug 26, 2019): OK. I fixed as many races and locks as I could. @sbatten, @Tyriar There will still be a situation where you close the pseudoconsole too fast that I cannot resolve. Since you own control of starting the child process, you can technically start it up with an invalid handle or against a pseudoconsole that has already been closed. That is, you could start the pseudoconsole and close it while the client application is already started. It will throw an error on its initialization when that happens and probably show a pop-up dialog with an error 0xc0000142 as "DLL initialization failed". This would not be a hang. I believe I fixed all the hangs I could find. To alleviate the other hangs from your side in the mean time, we highly recommend that each handle goes onto its own thread and is drained/filled to/from buffers on your side. We didn't originally envision this as an API consumed by a single blocked thread on the other side. I intend to update the documentation clarifying this point. The two specific hangs from the dumps are resolvable from the client side by: - Draining the output pipe in another thread inside VSCode. If the buffer contains a cursor position query message, respond to it on the input pipe even if you're already shutting down the pseudoconsole on another thread. - Even if you don't use the inherit-cursor message, drain the output pipe on another thread anyway into a local buffer. As the pseudoconsole goes down, it tries to emit one final layout/paint frame into the pipe buffer. If that's not drained, it hangs until it is drained.
Author
Owner

@miniksa commented on GitHub (Nov 8, 2019):

This caused many, many issues when imported to Windows so it had to be reverted. Reactivating as we'll have to refine it further going forward.

@miniksa commented on GitHub (Nov 8, 2019): This caused many, many issues when imported to Windows so it had to be reverted. Reactivating as we'll have to refine it further going forward.
Author
Owner

@JustinGrote commented on GitHub (Dec 20, 2019):

@miniksa hopefully it can get sorted, it currently causes all kinds of hard locks in Visual Studio Code at the moment in any terminal. To reproduce, just type 'a' and kill the process.
Capture

Quite frustrating :/

@JustinGrote commented on GitHub (Dec 20, 2019): @miniksa hopefully it can get sorted, it currently causes all kinds of hard locks in Visual Studio Code at the moment in any terminal. To reproduce, just type 'a' and kill the process. ![Capture](https://user-images.githubusercontent.com/15258962/71267430-5b2e7e80-22ff-11ea-8d36-c1b0bcfba51d.gif) Quite frustrating :/
Author
Owner

@miniksa commented on GitHub (Jan 2, 2020):

@JustinGrote, to be 100% clear, VS Code can fix this on the client end by properly threading the handles received from the ConPTY. The hang is because all of the ConPTY handles are being serviced on the same thread in VS Code. To resolve this condition, the output and input handles should be managed on separate threads/queues than the one that is managing the session.

My attempts at fixing this were around making it so that everything could be processed on one thread because I hate giving a customer (in this case VS Code) the answer of "you're holding it wrong." But I was unsuccessful and broke WSL at the same time.

Unfortunately as the console host and WSL are inbox components that ship in the same OS release, I had to bias toward making sure they worked well together. VS Code is not an inbox component and can update frequently, so perhaps it can be fixed out of band before I can come up with a better solution that allows a single-threaded client to not hang AND preserves the final-flush operation that WSL depends on for Win32 interop.

@miniksa commented on GitHub (Jan 2, 2020): @JustinGrote, to be 100% clear, VS Code can fix this on the client end by properly threading the handles received from the ConPTY. The hang is because all of the ConPTY handles are being serviced on the same thread in VS Code. To resolve this condition, the output and input handles should be managed on separate threads/queues than the one that is managing the session. My attempts at fixing this were around making it so that everything *could* be processed on one thread because I hate giving a customer (in this case VS Code) the answer of "you're holding it wrong." But I was unsuccessful and broke WSL at the same time. Unfortunately as the console host and WSL are inbox components that ship in the same OS release, I had to bias toward making sure they worked well together. VS Code is not an inbox component and can update frequently, so perhaps it can be fixed out of band before I can come up with a better solution that allows a single-threaded client to not hang AND preserves the final-flush operation that WSL depends on for Win32 interop.
Author
Owner

@JustinGrote commented on GitHub (Jan 2, 2020):

@miniksa understood, "to ship is to choose" :). I'll see if I can find a related vscode open issue and reference this here, otherwise I'll open a net-new one.

@JustinGrote commented on GitHub (Jan 2, 2020): @miniksa understood, "to ship is to choose" :). I'll see if I can find a related vscode open issue and reference this here, otherwise I'll open a net-new one.
Author
Owner

@Tyriar commented on GitHub (Jan 2, 2020):

@miniksa help would be appreciated here. I'm not a C++ dev anymore and don't have time to invest in researching how to do it the right way and figuring out all the ins and outs that come with managing the different threads. Additionally, the initial (broken?) conpty implementation in node-pty was primarily done by your team https://github.com/microsoft/node-pty/issues/375

@Tyriar commented on GitHub (Jan 2, 2020): @miniksa help would be appreciated here. I'm not a C++ dev anymore and don't have time to invest in researching how to do it the right way and figuring out all the ins and outs that come with managing the different threads. Additionally, the initial (broken?) conpty implementation in node-pty was primarily done by your team https://github.com/microsoft/node-pty/issues/375
Author
Owner

@miniksa commented on GitHub (Jan 2, 2020):

OK. We'll see. Thanks for the pointer to the code. We'll have to see where/when we can find resources for that and for this issue.

@miniksa commented on GitHub (Jan 2, 2020): OK. We'll see. Thanks for the pointer to the code. We'll have to see where/when we can find resources for that and for this issue.
Author
Owner

@allforabit commented on GitHub (Mar 11, 2020):

This is happening to me as well. It consistently hangs when I restart a terminal process. My workaround is the use the git bash terminal.

@allforabit commented on GitHub (Mar 11, 2020): This is happening to me as well. It consistently hangs when I restart a terminal process. My workaround is the use the git bash terminal.
Author
Owner

@DHowett-MSFT commented on GitHub (Mar 11, 2020):

@allforabit I’m willing to bet what you’re describing is different. How are you restarting the terminal process? We haven’t seen any reports of hanging-on-close in a while.

@DHowett-MSFT commented on GitHub (Mar 11, 2020): @allforabit I’m willing to bet what you’re describing is different. How are you restarting the terminal process? We haven’t seen any reports of hanging-on-close in a while.
Author
Owner

@allforabit commented on GitHub (Mar 11, 2020):

Hi @DHowett-MSFT it's when I restart a task by running the "Tasks: Restart running task". Apologies for the vague description in the comment. Thanks!

@allforabit commented on GitHub (Mar 11, 2020): Hi @DHowett-MSFT it's when I restart a task by running the "Tasks: Restart running task". Apologies for the vague description in the comment. Thanks!
Author
Owner

@DHowett-MSFT commented on GitHub (Mar 11, 2020):

Okay, so you're using some tool to murder the terminal that's hosting all the pseudoconsoles. This is not a case of the "ClosePseudoConsole" API hanging, because when you murder the terminal that API is not getting called at all.

@DHowett-MSFT commented on GitHub (Mar 11, 2020): Okay, so you're using some tool to murder the terminal that's hosting all the pseudoconsoles. This is _not_ a case of the "ClosePseudoConsole" API hanging, because when you murder the terminal that API is not getting called at all.
Author
Owner

@allforabit commented on GitHub (Mar 11, 2020):

Ok I clicked this link from a Vs code link and I thought this issue was in the VS code issue queue. Pease disregard my comment. I will file an issue there.

@allforabit commented on GitHub (Mar 11, 2020): Ok I clicked this link from a Vs code link and I thought this issue was in the VS code issue queue. Pease disregard my comment. I will file an issue there.
Author
Owner

@DHowett-MSFT commented on GitHub (Mar 11, 2020):

Oh, I see. This is VS code. Okay. Yes, this may be the same issue.

@DHowett-MSFT commented on GitHub (Mar 11, 2020): Oh, I see. This _is_ VS code. Okay. Yes, this may be the same issue.
Author
Owner

@allforabit commented on GitHub (Mar 11, 2020):

Let me you need more info. It very consistently crashes for me so I can do a crash dump or whatever would help.

@allforabit commented on GitHub (Mar 11, 2020): Let me you need more info. It very consistently crashes for me so I can do a crash dump or whatever would help.
Author
Owner

@joshfria commented on GitHub (Apr 20, 2020):

Is there any movement on this issue? It causes hangs for many vscode users.

@joshfria commented on GitHub (Apr 20, 2020): Is there any movement on this issue? It causes hangs for many vscode users.
Author
Owner

@zadjii-msft commented on GitHub (Apr 20, 2020):

@joshfria To summarize the linked PRs:

  • We tried to fix this in #2525, however, that caused a number of issues with inbox products like WSL
  • We reverted those changes in #3488

Since then, we haven't had an opportunity to return to this bug. Unfortunately, it's a much harder problem than many of the other bugs on this repo, so it'll take some time to make sure we can fix it without breaking the rest of the world.

@zadjii-msft commented on GitHub (Apr 20, 2020): @joshfria To summarize the linked PRs: * We tried to fix this in #2525, however, that caused a number of issues with inbox products like WSL * We reverted those changes in #3488 Since then, we haven't had an opportunity to return to this bug. Unfortunately, it's a much harder problem than many of the other bugs on this repo, so it'll take some time to make sure we can fix it without breaking the rest of the world.
Author
Owner

@skapil-ind commented on GitHub (May 12, 2020):

I'm facing the same issue as well. I open the terminal through " Ctrl + ` ". Type cls or some other command few times, then i kill the terminal and bang, VS Code hangs up. Then comes a windows of keep waiting or reopen. Any progress on the issue??

@skapil-ind commented on GitHub (May 12, 2020): I'm facing the same issue as well. I open the terminal through " Ctrl + ` ". Type cls or some other command few times, then i kill the terminal and bang, VS Code hangs up. Then comes a windows of keep waiting or reopen. Any progress on the issue??
Author
Owner

@zadjii-msft commented on GitHub (May 12, 2020):

@sukritkapil2 Nope, when there's progress here, we'll make sure to post in this thread 😜

@zadjii-msft commented on GitHub (May 12, 2020): @sukritkapil2 Nope, when there's progress here, we'll make sure to post in this thread 😜
Author
Owner

@JustinGrote commented on GitHub (May 12, 2020):

@zadjii-msft out of curiosity, is there any way to tweak how long vscode hangs before prompting for a reload? Right now it's like 15 seconds or something, if I could get this down to 3 or so this would be less annoying.

@JustinGrote commented on GitHub (May 12, 2020): @zadjii-msft out of curiosity, is there any way to tweak how long vscode hangs before prompting for a reload? Right now it's like 15 seconds or something, if I could get this down to 3 or so this would be less annoying.
Author
Owner

@zadjii-msft commented on GitHub (May 12, 2020):

@JustinGrote Honestly I know next to nothing about VsCode, so I'm not gonna be super helpful with that question 😕 I will however summon my go-to vscode expert, @Tyriar, to see if he's got any ideas.

@zadjii-msft commented on GitHub (May 12, 2020): @JustinGrote Honestly I know next to nothing about VsCode, so I'm not gonna be super helpful with that question 😕 I will however summon my go-to vscode expert, @Tyriar, to see if he's got any ideas.
Author
Owner

@JustinGrote commented on GitHub (May 12, 2020):

@zadjii-msft whoops, I forgot which context I was in since this issue is spread across multiple repositories :)

@JustinGrote commented on GitHub (May 12, 2020): @zadjii-msft whoops, I forgot which context I was in since this issue is spread across multiple repositories :)
Author
Owner

@DanalEstes commented on GitHub (May 21, 2020):

Various issue records indicate this has been known for just under a year. May I respectfully request the priority of this be raised? It is a strong dis-incentive to migrate to VScode.

As a data point regarding priority: It reproduces very consistently when volume of print coming into terminal window is high. "High" meaning anything more that a print every 1/10 of a second.

When terminal is killed, all UI responsiveness in VScode stops three to five seconds later. "Reload" dialog pops up an eternity (OK, about 30 seconds) after that. Reload succeeds.

Version: 1.45.1 (user setup)
Commit: 5763d909d5f12fe19f215cbfdd29a91c0fa9208a
Date: 2020-05-14T08:27:35.169Z
Electron: 7.2.4
Chrome: 78.0.3904.130
Node.js: 12.8.1
V8: 7.8.279.23-electron.0
OS: Windows_NT x64 10.0.18362

@DanalEstes commented on GitHub (May 21, 2020): **Various issue records indicate this has been known for just under a year. May I respectfully request the priority of this be raised? It is a strong dis-incentive to migrate to VScode.** As a data point regarding priority: It reproduces *very consistently* when volume of print coming into terminal window is high. "High" meaning anything more that a print every 1/10 of a second. When terminal is killed, all UI responsiveness in VScode stops three to five seconds later. "Reload" dialog pops up an eternity (OK, about 30 seconds) after that. Reload succeeds. Version: 1.45.1 (user setup) Commit: 5763d909d5f12fe19f215cbfdd29a91c0fa9208a Date: 2020-05-14T08:27:35.169Z Electron: 7.2.4 Chrome: 78.0.3904.130 Node.js: 12.8.1 V8: 7.8.279.23-electron.0 OS: Windows_NT x64 10.0.18362
Author
Owner

@DHowett commented on GitHub (May 21, 2020):

Hey @DanalEstes, sorry about that. The four engineers who work here on this team have been pretty busy on Terminal since this issue was filed. We want to make sure we can give this issue the space it deserves so that we don't end up with a solution that breaks everyone else (#2525).

Interprocess communication over three handles (really five) doesn't seem like it should be terribly difficult, but it turns out that there's a lot of intricacies there.

@DHowett commented on GitHub (May 21, 2020): Hey @DanalEstes, sorry about that. The four engineers who work here on this team have been pretty busy on Terminal since this issue was filed. We want to make sure we can give this issue the space it deserves so that we don't end up with a solution that breaks everyone else (#2525). Interprocess communication over three handles (really five) doesn't seem like it should be terribly difficult, but it turns out that there's a lot of intricacies there.
Author
Owner

@christianparpart commented on GitHub (Jun 18, 2020):

Hey, I am developing my own terminal emulator and I am using ConPTY on Windows platform and I am also suffering from this bug. the ReadFile() into the ConPTY handle is just hanging forever as soon as the client-side has terminated.

Is there any best-practice workaround for that, until a proper fix will be there? I personally think such a bug is actually a blocker and should be handled ASAP, since you literally can't properly close the window.

@christianparpart commented on GitHub (Jun 18, 2020): Hey, I am developing my own terminal emulator and I am using ConPTY on Windows platform and I am also suffering from this bug. the ReadFile() into the ConPTY handle is just hanging forever as soon as the client-side has terminated. Is there any best-practice workaround for that, until a proper fix will be there? I personally think such a bug is actually a blocker and should be handled ASAP, since you literally can't properly close the window.
Author
Owner

@zadjii-msft commented on GitHub (Jun 19, 2020):

As mentioned earlier in this thread:

to be 100% clear, VS Code can fix this on the client end by properly threading the handles received from the ConPTY. The hang is because all of the ConPTY handles are being serviced on the same thread in VS Code. To resolve this condition, the output and input handles should be managed on separate threads/queues than the one that is managing the session.

@christianparpart could you do something like that?

@zadjii-msft commented on GitHub (Jun 19, 2020): As mentioned [earlier in this thread](https://github.com/microsoft/terminal/issues/1810#issuecomment-570284546): > to be 100% clear, VS Code can fix this on the client end by properly threading the handles received from the ConPTY. The hang is because all of the ConPTY handles are being serviced on the same thread in VS Code. To resolve this condition, the output and input handles should be managed on separate threads/queues than the one that is managing the session. @christianparpart could you do something like that?
Author
Owner

@JustinGrote commented on GitHub (Jun 19, 2020):

@christianparpart a PR to fix it is in progress:
https://github.com/microsoft/node-pty/issues/375

@JustinGrote commented on GitHub (Jun 19, 2020): @christianparpart a PR to fix it is in progress: https://github.com/microsoft/node-pty/issues/375
Author
Owner

@christianparpart commented on GitHub (Jun 19, 2020):

Hey @zadjii-msft, thanks for the mention. I am actually writing to the ConPTY pipe from the main thread whereas the reading part is done in another background thread.

I temporarily worked around it now by also watching on the connected process to get notified when the process died. I can add some EDIT here once I'm back home, with more details.

Also, thx for the ref @JustinGrote .

@christianparpart commented on GitHub (Jun 19, 2020): Hey @zadjii-msft, thanks for the mention. I am actually writing to the ConPTY pipe from the main thread whereas the reading part is done in another background thread. I temporarily worked around it now by also watching on the connected process to get notified when the process died. I can add some EDIT here once I'm back home, with more details. Also, thx for the ref @JustinGrote .
Author
Owner

@Tyriar commented on GitHub (Jul 8, 2020):

Note that I've made a PR that fixes this in node-pty https://github.com/microsoft/node-pty/pull/415, however it does not yet work in Electron 8 (haven't tried 9) when node integration is enabled as worker_threads doesn't seem to work https://github.com/electron/electron/issues/18540.

@Tyriar commented on GitHub (Jul 8, 2020): Note that I've made a PR that fixes this in node-pty https://github.com/microsoft/node-pty/pull/415, however it does not yet work in Electron 8 (haven't tried 9) when node integration is enabled as `worker_threads` doesn't seem to work https://github.com/electron/electron/issues/18540.
Author
Owner

@Tyriar commented on GitHub (Sep 25, 2020):

VS Code update: Electron isn't going to support worker_threads in the renderer thread, so the fix is blocked on moving where we host node-pty/conpty to a real node process: https://github.com/microsoft/node-pty/pull/415#issuecomment-688870219

@Tyriar commented on GitHub (Sep 25, 2020): VS Code update: Electron isn't going to support `worker_threads` in the renderer thread, so the fix is blocked on moving where we host node-pty/conpty to a real node process: https://github.com/microsoft/node-pty/pull/415#issuecomment-688870219
Author
Owner

@christianparpart commented on GitHub (Sep 29, 2020):

As mentioned earlier in this thread:

to be 100% clear, VS Code can fix this on the client end by properly threading the handles received from the ConPTY. The hang is because all of the ConPTY handles are being serviced on the same thread in VS Code. To resolve this condition, the output and input handles should be managed on separate threads/queues than the one that is managing the session.

@christianparpart could you do something like that?

I am currently actively looking into finally working around this ConPTY behavior.

Honestly, if your above quoted workaround is the only way to work around this, I'd rather drop Windows support (sadly) or just wait until maybe or maybe not eventually ConPTY can handle with read/write operations in one thread. The reasoning is not that I want to be stubborn but rather that ConPTY (windows) is the only platform that requires this architectural change just to not freeze at shutdown.

I'm trying it though, but it seems like not worth it for now. :)

UPDATE: Okay, I've actually forgotten my own source code, and I remember why I did already split up PTY reads from writes (because ConPTY does not support non-blocking I/O).

My problem was, that if the connected process behind ConPTY terminated, the already started ConPTY's read() call did not return and does hang forever. This is still the case with latest software updates, and I actually do consider this a bad (or buggy?) behaviour compared to the other PTY implementations cancelling any active I/O operation when the other end disconnects.
I did add a process exit watcher that does explicitly close the terminal's end of the PTY, and now the read-call is not hanging anymore.

Thanks :)

@christianparpart commented on GitHub (Sep 29, 2020): > As mentioned [earlier in this thread](https://github.com/microsoft/terminal/issues/1810#issuecomment-570284546): > > > to be 100% clear, VS Code can fix this on the client end by properly threading the handles received from the ConPTY. The hang is because all of the ConPTY handles are being serviced on the same thread in VS Code. To resolve this condition, the output and input handles should be managed on separate threads/queues than the one that is managing the session. > > @christianparpart could you do something like that? I am currently actively looking into finally working around this ConPTY behavior. Honestly, if your above quoted workaround is the *only* way to work around this, I'd rather drop Windows support (sadly) or just wait until maybe or maybe not eventually ConPTY can handle with read/write operations in one thread. The reasoning is not that I want to be stubborn but rather that ConPTY (windows) is the _only_ platform that requires this architectural change just to not freeze at shutdown. I'm trying it though, but it seems like not worth it for now. :) UPDATE: Okay, I've actually forgotten my own source code, and I remember why I did already split up PTY reads from writes (because ConPTY does not support non-blocking I/O). My problem was, that if the connected process behind ConPTY terminated, the already started ConPTY's `read()` call did not return and does hang forever. This is still the case with latest software updates, and I actually do consider this a bad (or buggy?) behaviour compared to the other PTY implementations cancelling any active I/O operation when the other end disconnects. I did add a process exit watcher that does explicitly close the terminal's end of the PTY, and now the `read`-call is not hanging anymore. Thanks :)
Author
Owner

@JustinGrote commented on GitHub (Feb 9, 2021):

@miniksa @Tyriar has merged https://github.com/microsoft/node-pty/pull/415#issuecomment-688870219 into Node-pty. Can you provide an update when this might make it upstream to VSCode?

@JustinGrote commented on GitHub (Feb 9, 2021): @miniksa @Tyriar has merged https://github.com/microsoft/node-pty/pull/415#issuecomment-688870219 into Node-pty. Can you provide an update when this might make it upstream to VSCode?
Author
Owner

@Tyriar commented on GitHub (Feb 10, 2021):

Probably tomorrow's insiders, follow https://github.com/microsoft/vscode/pull/116185 for updates

@Tyriar commented on GitHub (Feb 10, 2021): Probably tomorrow's insiders, follow https://github.com/microsoft/vscode/pull/116185 for updates
Author
Owner

@TheLizzard commented on GitHub (Apr 5, 2021):

Any updates? I am also having the same bug in my code. When I call ClosePseudoConsole my program hangs. It only happens when I call CreatePseudoConsole with dwFlags=1. Btw my project is written in python using the windows API.

@TheLizzard commented on GitHub (Apr 5, 2021): Any updates? I am also having the same bug in my code. When I call `ClosePseudoConsole` my program hangs. It only happens when I call `CreatePseudoConsole` with `dwFlags=1`. Btw my project is written in python using the windows API.
Author
Owner

@Anduh commented on GitHub (May 9, 2022):

https://github.com/microsoft/vscode/issues/144016 is likely related

@Anduh commented on GitHub (May 9, 2022): https://github.com/microsoft/vscode/issues/144016 is likely related
Author
Owner

@zadjii-msft commented on GitHub (Dec 5, 2022):

We are pretty sure that #14160 fixed this?

@zadjii-msft commented on GitHub (Dec 5, 2022): We are pretty sure that #14160 fixed this?
Author
Owner

@Tyriar commented on GitHub (Dec 6, 2022):

Nice! Sounds like it.

@Tyriar commented on GitHub (Dec 6, 2022): Nice! Sounds like it.
Author
Owner

@lhecker commented on GitHub (Dec 6, 2022):

There's still an easily reproducible bug in WSL however. It calls ClosePseudoConsole on the same thread that's reading from the output pipe and so it can deadlock (conhost is blocked writing to the pipe on exit, WSL is blocked waiting for conhost to exit). I'm planning to contact them and introduce a new ClosePseudoConsoleTimeout function to simplify the process (a timeout of 0 will equate to a fully async shutdown).

The bug reproduces if you execute any CLI application that writes more than 4kB of text before conhost has a chance to exit. This happens with applications similar to cat in particular and is a race condition in most situations. The deadlock is resolved once you run wsl --shutdown.

@lhecker commented on GitHub (Dec 6, 2022): There's still an easily reproducible bug in WSL however. It calls `ClosePseudoConsole` on the same thread that's reading from the output pipe and so it can deadlock (conhost is blocked writing to the pipe on exit, WSL is blocked waiting for conhost to exit). I'm planning to contact them and introduce a new `ClosePseudoConsoleTimeout` function to simplify the process (a timeout of 0 will equate to a fully async shutdown). The bug reproduces if you execute any CLI application that writes more than 4kB of text before conhost has a chance to exit. This happens with applications similar to `cat` in particular and is a race condition in most situations. The deadlock is resolved once you run `wsl --shutdown`.
Author
Owner

@zadjii-msft commented on GitHub (Jan 24, 2023):

I'm gonna close this out, because we think this was fixed in 1.17. It'll obviously be a while before we can get a newer version ingested into VsCode (cause we basically need to do... #6999? That doesn't look right but it's close), but the code on our end should be finished.

If this still repros with the 1.17+ version of the API, then we'll probably need something that builds on that fix, rather than revert that and try something new.

@zadjii-msft commented on GitHub (Jan 24, 2023): I'm gonna close this out, because we think this was fixed in 1.17. It'll obviously be a while before we can get a newer version ingested into VsCode (cause we basically need to do... #6999? That doesn't look right but it's close), but the code on our end should be finished. If this still repros with the 1.17+ version of the API, then we'll probably need something that builds on that fix, rather than revert that and try something new.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/terminal#2535