Feature Request: Focus/Activate Tab by WT_SESSION #24002

Closed
opened 2026-01-31 08:58:44 +00:00 by claunia · 3 comments

Originally created by @shanselman on GitHub (Jan 25, 2026).

Feature Request: Focus/Activate Tab by WT_SESSION

The Problem

There's no way for an external process to programmatically switch to a specific Windows Terminal tab.

When a process running inside a terminal tab needs to bring the user back to that exact tab (not just the window), it's currently impossible. The wt.exe CLI can open new tabs and target windows, but cannot activate an existing tab by its session identifier.

Why This Matters Now: The AI Agent Era

2024-2025 has seen an explosion of AI coding agents that run in the terminal:

| Agent | Vendor | Typical Task Duration |
|-------|--------|-----------------------|
| Claude Code | Anthropic | 30 sec - 10+ min |
| GitHub Copilot CLI | GitHub/Microsoft | 10 sec - 5 min |
| Gemini CLI | Google | 10 sec - 5 min |
| Aider | Open Source | 30 sec - 10 min |
| Cursor | Anysphere | Variable |

These agents run autonomously—refactoring code, running tests, fixing bugs across multiple files. Users naturally alt-tab away while waiting. When the agent finishes:

  1. A notification appears ("Claude finished refactoring")
  2. User clicks the notification
  3. Windows Terminal comes to foreground...
  4. But the user is on the wrong tab and has to hunt for where the agent was running

With 5-10 tabs open across different projects, this is a real friction point.

Prior Art: Every Other Major Terminal Has This

tmux (Gold Standard)

# Direct tab/pane targeting by session ID
tmux select-window -t mysession:2
tmux select-pane -t mysession:2.3

# Query all sessions, windows, panes
tmux list-sessions
tmux list-windows -t <session>
tmux list-panes -t <session:window>

iTerm2 (macOS)

# Full Python API for programmatic control
import iterm2

async def main(connection):
    app = await iterm2.async_get_app(connection)
    window = app.current_terminal_window
    target_tab = window.tabs[2]
    await target_tab.async_activate()  # ← This is what we need

iterm2.run_until_complete(main)

Also supports AppleScript:

tell application "iTerm2"
    tell current window
        select tab 2
    end tell
end tell

Kitty (Linux/macOS)

# IPC via Unix socket - can focus tabs by index or match criteria
kitty @ focus-tab --match index:2
kitty @ focus-window --match title:myproject

# Query state
kitty @ ls  # Returns JSON of all windows, tabs, panes

Windows Terminal: The Gap

# We can do this:
wt.exe -w 0 nt                    # New tab in MRU window
wt.exe -w 2 split-pane            # Split in window 2

# Index-based focus already exists:
wt.exe -w 0 focus-tab -t 3               # Focus tab at index 3 (no session targeting)

# We cannot do this:
wt.exe --focus-session $env:WT_SESSION   # ❌ Doesn't exist
wt.exe --query-sessions                   # ❌ Doesn't exist

Technical Context

What Already Exists (The Building Blocks)

  1. WT_SESSION environment variable - Already set per tab/pane, unique GUID like 5720ee6d-6474-47b0-88db-fa7e10e60d37

  2. WT_PROFILE_ID environment variable - Profile GUID, useful but not unique per tab

  3. Window targeting via -w - Can target windows by ID or name, just not tabs within them

  4. Shell integration OSC sequences - Terminal already has rich bidirectional communication:

    OSC 133;A ST  - Start of prompt
    OSC 133;B ST  - Start of command input  
    OSC 133;C ST  - Command executed
    OSC 133;D;N ST - Command finished with exit code N
    OSC 9;9;CWD ST - Current working directory
    
  5. Internal tab management - Terminal obviously tracks tabs internally; just not exposed externally
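
To make the OSC building block concrete, here is a minimal sketch of how a process could frame the sequences listed above before writing them to its terminal. The helper names (`osc`, `prompt_start`, `command_finished`, `report_cwd`) are illustrative assumptions; only the payloads themselves come from the list.

```python
ESC = "\x1b"
ST = ESC + "\\"  # String Terminator (ESC \)


def osc(payload: str) -> str:
    """Wrap a payload in an OSC ... ST envelope."""
    return f"{ESC}]{payload}{ST}"


def prompt_start() -> str:
    # OSC 133;A ST - start of prompt
    return osc("133;A")


def command_finished(exit_code: int) -> str:
    # OSC 133;D;N ST - command finished with exit code N
    return osc(f"133;D;{exit_code}")


def report_cwd(path: str) -> str:
    # OSC 9;9;CWD ST - current working directory
    return osc(f"9;9;{path}")
```

A shell or agent would write the returned string to stdout unmodified, for example `sys.stdout.write(command_finished(0))`.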

Current Workarounds (All Fragile)

UI Automation - Enumerate TabItem elements, match by title, call SelectionItemPattern.Select():

// This works but is fragile
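CComPtr<IUIAutomationCondition> condition;
CComPtr<IUIAutomationElement> tabItem;
// The property value must be passed as a VARIANT (here a BSTR holding the title)
automation->CreatePropertyCondition(UIA_NamePropertyId, CComVariant(tabTitle), &condition);
terminalElement->FindFirst(TreeScope_Descendants, condition, &tabItem);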
// Problem: Tab titles change, can be duplicated, user can rename

Keyboard simulation - Send Ctrl+Tab repeatedly:

// Extremely fragile, timing-dependent
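keybd_event(VK_CONTROL, 0, 0, 0);
keybd_event(VK_TAB, 0, 0, 0);
keybd_event(VK_TAB, 0, KEYEVENTF_KEYUP, 0);      // release, or Ctrl+Tab stays "held"
keybd_event(VK_CONTROL, 0, KEYEVENTF_KEYUP, 0);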
// Problem: No way to know when you've reached the right tab

Title matching via console API:

wchar_t title[1024];
GetConsoleTitleW(title, 1024);  
// Problem: Title ≠ Tab identity, changes with CWD, can be duplicated
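
The weakness shared by all three workarounds is that a title is not an identity. The toy model below illustrates this under stated assumptions: the `Tab` type and both lookup helpers are hypothetical, and the second GUID is made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tab:
    session: str  # WT_SESSION-style GUID, unique per tab/pane
    title: str    # user-visible title; may repeat, change, or be renamed


def find_by_title(tabs: List[Tab], title: str) -> List[Tab]:
    """Title lookup can return zero, one, or several candidates."""
    return [t for t in tabs if t.title == title]


def find_by_session(tabs: List[Tab], session: str) -> Optional[Tab]:
    """Session lookup is unambiguous because the GUID is unique."""
    for t in tabs:
        if t.session == session:
            return t
    return None


# Two tabs in the same project directory end up with the same title.
tabs = [
    Tab("5720ee6d-6474-47b0-88db-fa7e10e60d37", "pwsh - myproject"),
    Tab("11111111-2222-3333-4444-555555555555", "pwsh - myproject"),
]
```

With this data, `find_by_title(tabs, "pwsh - myproject")` yields two candidates and the caller has to guess, while `find_by_session` returns exactly the tab that launched the agent.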

Proposed Solution

Option A: New focus-tab Subcommand with Session Targeting

# Focus tab by its WT_SESSION GUID (most robust)
wt.exe focus-tab --session 5720ee6d-6474-47b0-88db-fa7e10e60d37

# Focus tab by index in current/specified window (focus-tab -t/--target already covers this today)
wt.exe focus-tab --index 2
wt.exe -w 0 focus-tab --index 2

Option B: Extend -w to Accept Session References

# Session becomes a first-class window/tab identifier
wt.exe -w session:5720ee6d-6474-47b0-88db-fa7e10e60d37 focus-tab

This aligns with @zadjii-msft's comment in #17963:

"I do have plans to have -w 0 do The Smart Thing, and figure out the right window ID based on the current WT_SESSION_ID"

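One way to read Option B is that the value given to `-w` gains a third form next to window IDs and window names. The sketch below shows how such a value could be classified. The `session:` prefix is the proposal from this issue, not an existing wt.exe feature, and `parse_window_target` is a hypothetical helper.

```python
import uuid
from typing import Tuple, Union

WindowTarget = Tuple[str, Union[int, str, uuid.UUID]]


def parse_window_target(value: str) -> WindowTarget:
    """Classify a -w argument as a session GUID, a window ID, or a window name."""
    prefix = "session:"
    if value.startswith(prefix):
        # Raises ValueError if the GUID is malformed.
        return ("session", uuid.UUID(value[len(prefix):]))
    try:
        return ("id", int(value))
    except ValueError:
        return ("name", value)
```

Keeping the prefix explicit avoids guessing whether a bare string is a window name or a GUID.
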
Option C: Query Command (Enables Advanced Scenarios)

# Get structured info about sessions
wt.exe query --session 5720ee6d-6474-47b0-88db-fa7e10e60d37
# Output: {"window_id": 2, "tab_index": 3, "title": "pwsh - myproject"}

# List all sessions (tmux-style)
wt.exe query --list-sessions
# Output: [{"session": "5720ee6d-...", "window_id": 2, "tab_index": 3}, ...]
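
If Option C existed, a caller could resolve a session to its window and tab with a few lines of JSON handling. This is a sketch only: the `query --list-sessions` command and its output shape are the proposal above, and `locate_session` plus the sample string are assumptions for illustration.

```python
import json
from typing import Optional


def locate_session(list_output: str, session: str) -> Optional[dict]:
    """Find one session in the proposed --list-sessions JSON output."""
    wanted = session.lower()
    for entry in json.loads(list_output):
        if str(entry.get("session", "")).lower() == wanted:
            return entry
    return None


# Sample shaped like the proposed output, with the GUID written out in full.
sample = (
    '[{"session": "5720ee6d-6474-47b0-88db-fa7e10e60d37", '
    '"window_id": 2, "tab_index": 3}]'
)
```

The comparison is case-insensitive because GUIDs are often printed in either case.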

Real-World Use Case: Toasty Notification Tool

I'm building Toasty (https://github.com/shanselman/toasty), a Windows toast notification CLI for AI coding agents. The workflow:

┌─────────────────────────────────────────────────────────────────┐
│  1. Claude Code runs in Tab 3, user alt-tabs to browser        │
│  2. Claude finishes → triggers hook → toasty shows toast       │
│  3. Toasty captures WT_SESSION="5720ee6d-..." at invocation    │
│  4. User clicks toast notification                              │
│  5. Toasty calls: wt.exe focus-tab --session 5720ee6d-...      │
│  6. ✓ User lands exactly in Tab 3 where Claude was running     │
└─────────────────────────────────────────────────────────────────┘

Current broken flow: Step 5 doesn't exist, so users land on whatever tab was last active and have to manually find their agent.
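
Steps 3 and 5 of the workflow can be sketched as follows. This is not Toasty's actual code: `capture_session` and `focus_command` are hypothetical helpers, and `focus-tab --session` is the flag proposed in this issue, which wt.exe does not provide.

```python
import uuid
from typing import List, Mapping, Optional


def capture_session(env: Mapping[str, str]) -> Optional[str]:
    """Step 3: read WT_SESSION at invocation time and normalize it."""
    raw = env.get("WT_SESSION")
    if not raw:
        return None  # not running inside Windows Terminal
    try:
        return str(uuid.UUID(raw))  # lowercase, hyphenated form
    except ValueError:
        return None  # present but not a GUID


def focus_command(session: str) -> List[str]:
    """Step 5: build the (proposed, not yet existing) focus invocation."""
    return ["wt.exe", "focus-tab", "--session", session]
```

A notifier would call `capture_session(os.environ)` when the hook fires, store the result with the toast, and run `focus_command(...)` only when the user clicks it.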

Hook Configuration (Already Supported)

Claude Code (~/.claude/settings.json):

{
  "hooks": {
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "toasty.exe \"Claude finished\" --session %WT_SESSION%"
      }]
    }]
  }
}

GitHub Copilot (.github/hooks/toasty.json):

{
  "hooks": {
    "sessionEnd": [{
      "type": "command",
      "powershell": "toasty.exe 'Copilot finished' --session $env:WT_SESSION"
    }]
  }
}

Related Issues & Discussions

| Reference | Status | Relevance |
|-----------|--------|-----------|
| #17963 | Open Discussion | WT_WINDOWID request; @zadjii-msft mentioned plans for session-aware -w 0 |
| #10561 | Open | -w 0 not finding right window from quake mode |
| #16568 | Open | General programmatic API request (iTerm2-style) |
| #13006 | Open | WT_SESSION not set when Terminal is default terminal |
| #10708 | Open | Ability to identify pane IDs |

Implementation Considerations

  1. Session → Window/Tab mapping - Terminal already maintains this internally for tab management

  2. Cross-process communication - Could use existing named pipe/COM infrastructure that wt.exe uses for -w targeting

  3. Focus stealing - Windows restricts which processes may take the foreground, but Terminal already handles this for -w commands

  4. Backward compatibility - New subcommand/flag, no breaking changes
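
The first consideration, a session to window/tab mapping, amounts to a small lookup table. The sketch below is a toy model and says nothing about how Terminal stores this internally; `SessionRegistry` and its methods are hypothetical.

```python
from typing import Dict, Optional, Tuple


class SessionRegistry:
    """Toy lookup from a WT_SESSION GUID to (window_id, tab_index)."""

    def __init__(self) -> None:
        self._by_session: Dict[str, Tuple[int, int]] = {}

    def register(self, session: str, window_id: int, tab_index: int) -> None:
        # Called when a tab/pane is created or moved.
        self._by_session[session.lower()] = (window_id, tab_index)

    def unregister(self, session: str) -> None:
        # Called when the tab/pane closes; unknown sessions are ignored.
        self._by_session.pop(session.lower(), None)

    def resolve(self, session: str) -> Optional[Tuple[int, int]]:
        # None means the session no longer exists, so the caller can fall
        # back to focusing the window only.
        return self._by_session.get(session.lower())
```

Because tab indices shift when tabs are reordered or closed, a real implementation would more likely map the session to a live tab object than to a stored index.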

Summary

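| Terminal | Focus Specific Tab? | Query Sessions? | API Maturity |
|----------|---------------------|-----------------|--------------|
| tmux | ✅ select-window -t | ✅ list-sessions | Production |
| iTerm2 | ✅ Python API | ✅ Full introspection | Production |
| Kitty | ✅ kitty @ focus-tab | ✅ kitty @ ls | Production |
| Windows Terminal | ❌ (by index only, not by session) | ❌ | Gap |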

Windows Terminal has all the internal machinery (WT_SESSION, window management, shell integration). The missing piece is exposing tab activation to external processes.

This feature would make Windows Terminal a first-class citizen for the emerging AI agent workflow—and close a gap that every other major terminal has already solved.

I'm Happy to Help

I'd be glad to:

  • Test preview builds
  • Provide detailed feedback
  • Contribute to implementation if pointed in the right direction
  • Help document the feature

Environment: Windows 11 24H2, Windows Terminal 1.21+, C++/WinRT

cc @crutkas

claunia added the Needs-Triage and Needs-Tag-Fix labels 2026-01-31 08:58:44 +00:00

@o-sdn-o commented on GitHub (Jan 25, 2026):

I’d like to broaden the scope of this issue (perhaps this requires a separate discussion, or perhaps it is fundamentally unacceptable). I’d like to propose a deeper, architectural solution that solves this issue as a subset of a broader strategy for modern terminal interactions.

The immediate problem is a symptom of the current read-only nature of the Terminal <-> Process interface. Instead of implementing ad-hoc fixes for every new feature (like kitty keyboard protocols or in-band resize signaling), we should introduce a Scriptable Middleware Layer within the Terminal itself.

The Proposal: A Scriptable Sandbox

This moves the burden of protocol implementation from the Terminal's C++ core to a safe, application-defined layer.

1. Solving the Tab Focus Issue

A process running inside a session could push a simple Lua script to the terminal's sandbox via APC. This script, with explicit token authorization, could then call a terminal.FocusSession(guid) API, solving the current feature request securely.
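
For readers unfamiliar with APC, the sketch below shows how a script could be framed for transport. APC (ESC _ ... ST) is a standard control string, but everything inside it here is an assumption: the `lua;<token>;<base64>` layout and both helper functions are hypothetical and not part of any terminal today.

```python
import base64
from typing import Tuple

ESC = "\x1b"
APC = ESC + "_"   # Application Program Command introducer
ST = ESC + "\\"   # String Terminator


def frame_script(token: str, script: str) -> str:
    """Wrap a script in an APC envelope (hypothetical payload layout)."""
    if ";" in token:
        raise ValueError("token must not contain ';'")
    # Base64 keeps ESC and other control bytes out of the payload.
    encoded = base64.b64encode(script.encode("utf-8")).decode("ascii")
    return f"{APC}lua;{token};{encoded}{ST}"


def unframe_script(sequence: str) -> Tuple[str, str]:
    """Inverse of frame_script, as the terminal side might implement it."""
    if not (sequence.startswith(APC) and sequence.endswith(ST)):
        raise ValueError("not an APC sequence")
    kind, token, encoded = sequence[len(APC):-len(ST)].split(";", 2)
    if kind != "lua":
        raise ValueError("unsupported payload kind")
    return token, base64.b64decode(encoded).decode("utf-8")
```

Encoding the script matters because a raw ESC inside the payload could terminate the control string early.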

2. The Universal Extension Point

This architecture creates a programmable "middleware":

  • Protocol Agnosticism: The terminal core needs no updates for new input standards. The application pushes a Lua handler dynamically.
  • Security (Traffic Analysis): A sandbox allows for features like Application Defined Reporting (ADR) and Constant Bitrate (CBR) traffic to mask input patterns over SSH, eliminating side-channel vulnerabilities.
  • Decoupling Logic: It separates UI behavior (the script) from the application (the agent), improving consistency and performance.

3. Addressing Past Concerns (Isolation)

This approach respects the goal of isolating implementation details. The sandbox is strictly isolated and can only interact via predefined, safe APIs (e.g., terminal.Send and terminal.Listen). This provides robust security while enabling powerful new features needed for AI coding agents.

Conceptual Script Example for Focus Management

A simplified example showing how a process would request focus via the sandbox:

-- Script pushed by an authorized AI Agent via APC
local session_id = "{YOUR_AGENT_SESSION_GUID}"

-- Request the terminal to activate this specific tab/pane
terminal.FocusSession(session_id)

-- Log that the action was requested (optional)
terminal.Send("Focus request sent for " .. session_id)

Conceptual Example for Application Defined Reporting (ADR)

Overview

Application Defined Reporting (ADR) is a paradigm shift in terminal-to-application communication. Instead of relying on hardcoded keyboard or mouse protocols (like xterm, kitty, or win32-input-mode), ADR allows the application to dynamically define its own event-reporting logic. This is achieved by deploying a lightweight, isolated execution environment (Lua sandbox) on the terminal side.

Architecture

In the terminal-side environment, the input pipeline is transformed into a programmable stream:

  • Source: Physical HID events (keyboard, mouse, focus, system signals) are captured by the terminal core as binary (Lua-aware) messages.
  • The Processor (Lua Sandbox): The application injects a script into the terminal's sandbox via an APC sequence. This script subscribes to specific event IDs (e.g., terminal.Listen("hids::keybd::any", ...)).
  • The Output: The script processes these events and emits custom-formatted data back to the application using a unified interface.

Solving the Obfuscation Leakage

The primary goal of ADR is to eliminate side-channel vulnerabilities (Timing and Packet Size Analysis) that plague traditional protocols:

  • Constant Bitrate (CBR) Enforcement: The Lua script acts as a "metronome." It collects events and flushes them to the network at fixed intervals (e.g., every 20ms).
  • Deterministic Packet Size: By using some binary format, the script ensures that every outbound frame has a fixed length. If no input is detected, the script automatically generates payload::chaff (No-Op) frames.
  • Entropy Suppression: Because the "noise" (chaff) and "signal" (actual input) are formatted identically by the same script, an external observer (e.g., an attacker monitoring the SSH/network traffic) sees a monolithic, unchanging data stream with zero information leakage.
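
The fixed-length framing in the list above can be sketched as below. The layout is an assumption, since the comment only says "some binary format": a 2-byte length prefix followed by zero padding, with chaff represented as a frame whose length is 0. The prefix lets the receiver strip the padding again, which plain zero-padding alone would not allow for events that themselves end in zero bytes.

```python
import struct

FRAME_SIZE = 64
HEADER = struct.Struct("!H")  # 2-byte big-endian payload length
MAX_PAYLOAD = FRAME_SIZE - HEADER.size


def pack_frame(event: bytes = b"") -> bytes:
    """Build one fixed-size frame; an empty event produces a chaff frame."""
    if len(event) > MAX_PAYLOAD:
        raise ValueError("event does not fit in one frame")
    padding = b"\0" * (MAX_PAYLOAD - len(event))
    return HEADER.pack(len(event)) + event + padding


def unpack_frame(frame: bytes) -> bytes:
    """Recover the event; returns b'' for chaff."""
    if len(frame) != FRAME_SIZE:
        raise ValueError("unexpected frame size")
    (length,) = HEADER.unpack_from(frame)
    if length > MAX_PAYLOAD:
        raise ValueError("corrupt length field")
    return frame[HEADER.size:HEADER.size + length]
```

Equal frame sizes only hide anything if the transport is encrypted, as in the SSH scenario described above. In cleartext, the length field and payload would reveal which frames are chaff.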

Key Advantages

  • Protocol Agnosticism: The terminal no longer needs to be updated to support new input standards. If an application needs a new way to report 3D mouse coordinates or pressure-sensitive keys, it simply pushes a new Lua handler.
  • True Sandbox Isolation: The Lua environment is strictly isolated with no access to system APIs or direct network sockets. It can only communicate via the terminal.Send abstraction, ensuring that ADR cannot be weaponized to exfiltrate local data.
  • Bidirectional Integrity: Modern shells and TUIs can coordinate with the terminal to establish a "Full-Duplex Constant Stream," where the server side answers every "ping" from the terminal's ADR script with a "pong," masking server response times.

Notes

The ADR implementation must ensure that the transition between the default input mode and the scriptable mode is atomic. Once the hids subscription is active in the sandbox, raw terminal sequences should be suppressed to prevent "double-reporting" and metadata leaks.

ADR Initiation Script (Lua) 

This script demonstrates how an application can "push" logic into the terminal sandbox to activate ADR mode with built-in obfuscation. It transforms the input stream into a protected, constant-rate flow. 

-- ADR Obfuscation Script for terminal
-- Purpose: Intercept keyboard events and enforce a 20ms Constant Bitrate (CBR) stream.

local event_buffer = {}
local FRAME_SIZE = 64  -- Fixed binary frame size
local TICK_RATE = 20   -- Transmission interval in milliseconds (50 FPS)

-- 1. Subscribe to keyboard events (ADR)
-- We intercept all keystrokes and store them in a local buffer.
terminal.Listen("hids::keybd::any", function(e)
    -- In terminal, 'e' is already a structured binary object/binary string.
    -- We push it into our queue for synchronized transmission.
    table.insert(event_buffer, e)
end)

-- 2. Chaff Generation Function (Noise)
-- Creates a dummy NOP packet that is indistinguishable from real input in size.
local function generate_chaff()
    -- Returns a binary NOP (No-Operation) message padded to FRAME_SIZE.
    return string.rep("\0", FRAME_SIZE) 
end

-- 3. Main Obfuscation Loop (Metronome)
-- Executed strictly every 20ms.
terminal.SetTimer(TICK_RATE, function()
    local payload = ""

    if #event_buffer > 0 then
        -- Pop the real event from the queue
        local raw_event = table.remove(event_buffer, 1)
        
        -- Apply Padding to ensure the packet size doesn't leak the event type.
        -- Whether it's "Enter" or "Ctrl+Alt+F12", the size remains constant.
        payload = pad_to_size(raw_event, FRAME_SIZE)
    else
        -- If no user activity, send a Chaff (noise) packet.
        payload = generate_chaff()
    end

    -- Send the prepared binary frame.
    -- To an external observer, this call happens continuously with identical weight.
    terminal.Send(payload)
end)

-- Helper function: Padding
function pad_to_size(data, size)
    if #data >= size then
        return string.sub(data, 1, size)
    else
        return data .. string.rep("\0", size - #data)
    end
end

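-- Notify the server-side application that ADR mode is now active.
-- Padded like every other frame so the announcement does not stand out by size.
terminal.Send(pad_to_size("ADR_MODE_ACTIVE", FRAME_SIZE))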

Key Technical Takeaways

  • Stealth: The terminal.SetTimer ensures that data packets are dispatched on a strict schedule, regardless of whether the user is typing or idle.
  • Binary Uniformity: By leveraging the binary format, even when pad_to_size is applied, the packet structure remains a valid frame for the server-side parser.
  • Zero Information Leakage: An attacker monitoring the SSH/network traffic will observe a stream of packets (~100-120 bytes including headers) every 20ms with no detectable spikes or pauses.

@shanselman commented on GitHub (Jan 26, 2026):

@o-sdn-o this is a super cool idea but I agree it's WAY broader than just "focus on a WT_SESSION" so I'd move this to its own issue as your idea, while awesome, could get big fast


@DHowett commented on GitHub (Jan 28, 2026):

Per the write-up in #19788, I'm gonna reject this for now in favor of begrudgingly doing OSC777.

Reference: starred/terminal#24002