Feature Request: 🛠 Support for local LLM tools in Terminal Chat like Ollama #20990

Open
opened 2026-01-31 07:29:53 +00:00 by claunia · 4 comments

Originally created by @Samk13 on GitHub (Dec 14, 2023).

Description of the new feature/enhancement

The Windows Terminal Chat currently supports only the Azure OpenAI Service. This restriction limits developers who work with (or are building) their own local Large Language Models (LLMs), or who use tools such as Ollama and need to interface with them directly within the Terminal.
The ability to connect to a local LLM service would allow for greater flexibility, especially for those concerned with privacy, working offline, or dealing with sensitive information that cannot be sent to cloud services.

Proposed technical implementation details (optional)

Include functionality to support local LLM services by allowing users to configure a connection to local AI models. This would involve:

  1. Providing an option in the Terminal Chat settings to specify the endpoint of a local LLM service.
  2. Allowing the user to set the port that the local LLM service listens on for incoming requests (see the sketch below).
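
To make the request concrete, here is a minimal sketch of what such a configurable endpoint would enable: a chat request sent straight to a local Ollama server over its OpenAI-compatible route. The host, port, and model name (`llama3`) are illustrative assumptions, not anything Terminal Chat ships today.

```python
# Minimal sketch: talking to a local Ollama server instead of Azure OpenAI.
# Assumes Ollama is running on its default port 11434 with a "llama3"
# model already pulled; adjust endpoint and model to your setup.
import requests

endpoint = "http://localhost:11434"    # the user-configurable base URL
route = "/v1/chat/completions"         # OpenAI-compatible route exposed by Ollama

resp = requests.post(
    endpoint + route,
    json={
        "model": "llama3",             # hypothetical locally pulled model
        "messages": [
            {"role": "user", "content": "How do I list hidden files in PowerShell?"}
        ],
        "stream": False,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```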

Thanks!

claunia added the Issue-Feature, Needs-Tag-Fix, Product-Terminal, Area-Chat labels 2026-01-31 07:29:54 +00:00

@dossjjx commented on GitHub (May 28, 2024):

Would love to see this feature. Phi models would be great for this.


@g0t4 commented on GitHub (Jun 17, 2024):

As a workaround, I set up https://github.com/g0t4/term-chat-ollama as an intermediate "proxy" that can forward requests to any OpenAI-compatible completions backend, i.e. ollama, OpenAI, groq.com, etc.

FYI, video overview here: https://youtu.be/-QcSRmrsND0

@dossjj with this, you can use phi3 by setting the endpoint to `https://fake.openai.azure.com:5000/answer?model=phi3`
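
For anyone who wants to poke at the proxy outside of Terminal Chat, a request along these lines should exercise it. The `/answer?model=...` route comes from the comment above, but the request body shape and the TLS handling are assumptions (the proxy presents an Azure-OpenAI-style surface on a faked hostname), so check the term-chat-ollama README for the exact contract.

```python
# Rough sketch of hitting the workaround proxy directly.
# Assumptions: the fake hostname resolves to the local proxy, and the
# proxy accepts an OpenAI-style chat payload; verify against the README.
import requests

proxy_url = "https://fake.openai.azure.com:5000/answer"
resp = requests.post(
    proxy_url,
    params={"model": "phi3"},   # forwarded to the ollama backend
    json={"messages": [{"role": "user", "content": "hello"}]},  # assumed payload
    verify=False,               # faked hostname, so the TLS cert won't validate
    timeout=60,
)
print(resp.status_code, resp.text)
```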


@schaveyt commented on GitHub (Nov 1, 2024):

Please just implement this as an OpenAI-compatible interface.

- [Ollama has supported this since Feb 2024](https://ollama.com/blog/openai-compatibility)
- LiteLLM is used as a proxy to many hosted backends for businesses
- Anthropic models can be put behind LiteLLM and invoked this way

They simply require these three pieces of information:

OPENAPI_BASE_URL="http://some-endpoint/api"
OPENAPI_KEY="12345"
OPENAPI_MODEL="gpt-4o"
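
As a sense of how little surface area that is, here is a minimal sketch using the official `openai` Python package with exactly those three values, pointed at a local Ollama server (any OpenAI-compatible backend works the same way; the base URL, key, and model are placeholders):

```python
# The three values above are all an OpenAI-compatible client needs.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # OPENAPI_BASE_URL
    api_key="12345",                       # OPENAPI_KEY (ignored by Ollama, but the client requires one)
)

reply = client.chat.completions.create(
    model="llama3",                        # OPENAPI_MODEL; any locally pulled model
    messages=[{"role": "user", "content": "Say hello from a local model."}],
)
print(reply.choices[0].message.content)
```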


@SamAcctX commented on GitHub (Mar 5, 2025):

Just wanted to add my $.02.

I decided to try a POC and manually updated the hard-coded OpenAI URL and model here: `src/cascadia/QueryExtension/OpenAILLMProvider.cpp` to point at my local Ollama OpenAI endpoint and a model I already had, and did a build. I was able to use the chat feature perfectly with Ollama as the back end. To flesh out the request in a bit more detail...

  1. Make the `openAIEndpoint` user-configurable - at least as a base URL (e.g. https://ollama.mydomain.com or http://localhost:11434). Up to you whether or not you want to hard-code the `/v1` part of the full URI, or have the user include it in the input.
  • Personally, I'd probably lean towards hard-coding that bit, as most third-party LLM apps that have an OpenAI-compliant API also include the `/v1` in their custom APIs.

  2. Make the OpenAI API token an optional field. It's not required for Ollama and many others; if they don't have API authorization configured, they tend to just ignore the authorization header entirely.

  3. Instead of hard-coding a specific model, use the OpenAI `List Models` API endpoint to populate what models are available, then let the user select the model from a drop-down menu. The endpoint would be a GET request to `OPENAI_BASE_URL/v1/models` (e.g. https://api.openai.com/v1/models) - see the sketch after this list.
  • In addition to being a nice element to have customizable, the actual OpenAI API endpoint for this request requires a valid authentication header, so it's an easy way to validate the API key at the same time. Ref: https://platform.openai.com/docs/api-reference/models/list
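
Here is a short sketch of point 3: populating a model drop-down from the standard List Models endpoint. It works against api.openai.com (where the bearer token is mandatory) and against local backends such as Ollama that serve the same route (where the token is typically ignored); the base URL shown is an assumption for a local setup.

```python
# Sketch: fetch the available models to populate a drop-down menu.
import requests

base_url = "http://localhost:11434"    # or https://api.openai.com
api_key = ""                           # optional for Ollama, required for OpenAI

headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
resp = requests.get(f"{base_url}/v1/models", headers=headers, timeout=30)
resp.raise_for_status()

# Standard response shape: {"object": "list", "data": [{"id": "<model>", ...}, ...]}
model_ids = [m["id"] for m in resp.json()["data"]]
print(model_ids)
```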

@zadjii-msft - Hope this helps!

PS: Implementation as-described for part 3 above would also resolve https://github.com/microsoft/terminal/issues/18200

Reference: starred/terminal#20990