
Fix BYOK Quota exceeded caused by Intent Detection with Copilot model #228


Open · wants to merge 3 commits into base: main

Conversation

@fvclaus fvclaus commented Jul 12, 2025

Fixes microsoft/vscode#251944

Issue

Currently, intent detection is hard-coded to use the Copilot gpt-4o-mini model. This is a problem when you have exceeded your chat quota.

This results in one of two errors.

Sorry, your request failed. Reason: Canceled

This happens after the BYOK chat request returns with a 200: the code resets the Copilot token because it doesn't check whether the response originated from the Copilot API.

```ts
if (response.status === 200 && authenticationService.copilotToken?.isFreeUser && authenticationService.copilotToken?.isChatQuotaExceeded) {
	authenticationService.resetCopilotToken();
}
```
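A stricter guard could be sketched as follows. This is only an illustration: the interfaces and the `fromCopilotApi` flag are hypothetical stand-ins, not the extension's real API.

```typescript
// Hypothetical shapes standing in for the extension's real services.
interface CopilotToken {
  isFreeUser: boolean;
  isChatQuotaExceeded: boolean;
}

interface AuthenticationService {
  copilotToken?: CopilotToken;
  resetCopilotToken(): void;
}

// Only reset the Copilot token for responses that actually came from the
// Copilot API; a 200 from a BYOK endpoint is left alone.
function maybeResetToken(
  response: { status: number; fromCopilotApi: boolean }, // flag is hypothetical
  authenticationService: AuthenticationService
): boolean {
  if (
    response.fromCopilotApi && // new check: ignore BYOK responses
    response.status === 200 &&
    authenticationService.copilotToken?.isFreeUser &&
    authenticationService.copilotToken?.isChatQuotaExceeded
  ) {
    authenticationService.resetCopilotToken();
    return true;
  }
  return false;
}
```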

This also causes unnecessary network requests.

The second error I have seen (possibly on older versions of the extension) is the quota exceeded error:


Solution

I have tried to fix this in a way that remains extensible:

  • Decouple the intent models from the regular models (they are currently the same, but this leaves the option to change them later)
  • Design the API of the new service so that the intent detection model can be set per language model (currently there is only one intent detection model for all models)
  • The simplest fix would have been to use the current model of the chat request for intent detection, but that is potentially wasteful with expensive models, and it isn't transparent to the user that another API request is sent behind the scenes
  • Keep the BYOK code separate from the rest, so the BYOK code registers its models with the new service on startup
  • For all Copilot models, gpt-4o-mini is still used. I mentioned this in the quick pick message for users who want the same behaviour as before
  • I didn't want to mix intent detection with a Copilot model into a chat request with a BYOK model, because that makes the system more complicated and harder to reason about
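The per-model registry described above could look roughly like this. All names (`IntentModelService`, `registerIntentModel`, the model IDs) are hypothetical; this is a sketch of the design, not the actual code in this PR.

```typescript
// Sketch of a per-model intent detection registry. BYOK code registers an
// intent model for each of its chat models on startup; Copilot models fall
// back to the previous default, gpt-4o-mini.
class IntentModelService {
  private readonly byModel = new Map<string, string>();
  private readonly defaultModel = 'gpt-4o-mini'; // same behaviour as before

  // Called by the BYOK code on startup for each model it provides.
  registerIntentModel(chatModelId: string, intentModelId: string): void {
    this.byModel.set(chatModelId, intentModelId);
  }

  // Resolve which model to use for intent detection of a given chat model.
  getIntentModel(chatModelId: string): string {
    return this.byModel.get(chatModelId) ?? this.defaultModel;
  }
}
```

This keeps the BYOK path from ever issuing an intent detection request against the Copilot API, which is what triggered the token reset.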

With this PR, an error message is shown the first time a BYOK model is used.

The button opens a quick pick

@fvclaus fvclaus changed the title Fix BYOK Quota exceeded caused by Intent Detection Fix BYOK Quota exceeded caused by Intent Detection with Copilot model Jul 12, 2025
@fvclaus (Author) commented Jul 13, 2025

I have moved the new service to common to fix the tests; vscode dependencies are now imported dynamically. Maybe there is a better way. The tests run fine in my repo now.

@fvclaus (Author) commented Jul 13, 2025

I have released the code. You can download it from here: https://github.com/fvclaus/vscode-copilot-chat/releases/tag/v0.29.0-byok-fix-v1

To install it, go to settings, click the three dots, and select "Install from VSIX...".

Successfully merging this pull request may close these issues.

Copilot Free users with exhausted Chat requests can no longer use BYOK