fix: graduate PET configure timeout to avoid killing process prematurely (Fixes #1274)#1275
Merged
karthiknadig merged 4 commits intomainfrom Feb 25, 2026
Merged
fix: graduate PET configure timeout to avoid killing process prematurely (Fixes #1274)#1275karthiknadig merged 4 commits intomainfrom
karthiknadig merged 4 commits intomainfrom
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes a critical bug where the Python Environment Tools (PET) process was being killed prematurely when configure requests took longer than 30 seconds on large workspaces. The fix implements a graduated timeout approach that allows PET to continue working through the first timeout and only kills the process after two consecutive timeouts.
Changes:
- Introduced
RpcTimeoutErrorclass for type-safe timeout detection instead of fragile string matching - Implemented graduated configure timeouts: first timeout allows PET to continue, second consecutive timeout kills and restarts the process
- Moved
lastConfigurationcache write to after successful response to ensure failed configures are retried - Prevented double-kill by checking error type in
resolve()anddoRefresh()methods - Added
configureTimeoutCountreset on successful configure and process restart
Comments suppressed due to low confidence (1)
src/managers/common/nativePythonFinder.ts:631
- The configureTimeoutCount should be reset when the configuration options change between configure attempts. Currently, if configuration A times out once, then a different configuration B times out on the next attempt, the process will be killed even though these are different configurations. Consider resetting configureTimeoutCount when sending a new configuration that differs from the one that previously timed out. This could be done by storing the configuration that timed out and comparing it with the new configuration before incrementing the count.
private async configure() {
// Get all extra search paths including legacy settings and new searchPaths
const extraSearchPaths = await getAllExtraSearchPaths();
const options: ConfigurationOptions = {
workspaceDirectories: this.api.getPythonProjects().map((item) => item.uri.fsPath),
environmentDirectories: extraSearchPaths,
condaExecutable: getPythonSettingAndUntildify<string>('condaPath'),
pipenvExecutable: getPythonSettingAndUntildify<string>('pipenvPath'),
poetryExecutable: getPythonSettingAndUntildify<string>('poetryPath'),
cacheDirectory: this.cacheDirectory?.fsPath,
};
// No need to send a configuration request if there are no changes.
if (this.lastConfiguration && this.configurationEquals(options, this.lastConfiguration)) {
this.outputChannel.debug('[pet] configure: No changes detected, skipping configuration update.');
return;
}
this.outputChannel.info('[pet] configure: Sending configuration update:', JSON.stringify(options));
try {
await sendRequestWithTimeout(this.connection, 'configure', options, CONFIGURE_TIMEOUT_MS);
// Only cache after success so failed/timed-out calls will retry
this.lastConfiguration = options;
this.configureTimeoutCount = 0;
} catch (ex) {
if (ex instanceof RpcTimeoutError) {
this.configureTimeoutCount++;
if (this.configureTimeoutCount >= MAX_CONFIGURE_TIMEOUTS_BEFORE_KILL) {
// Repeated configure timeouts suggest PET is truly hung — kill and restart
this.outputChannel.error(
`[pet] Configure timed out ${this.configureTimeoutCount} consecutive times, killing hung process for restart`,
);
this.killProcess();
this.processExited = true;
this.configureTimeoutCount = 0;
} else {
// First timeout — PET may still be working. Let it continue and retry next call.
this.outputChannel.warn(
`[pet] Configure request timed out (attempt ${this.configureTimeoutCount}/${MAX_CONFIGURE_TIMEOUTS_BEFORE_KILL}), ` +
'will retry on next request without killing process',
);
}
} else {
this.outputChannel.error('[pet] configure: Configuration error', ex);
}
throw ex;
}
}
…ear lastConfiguration on failure (PR #1275)
Contributor
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/managers/common/nativePythonFinder.ts:165
sendRequestWithTimeout()doesn't clear thesetTimeout()timer when the JSON-RPC request completes successfully. If the request resolves beforetimeoutMs, the timer will still fire later, callcts.cancel(), and rejecttimeoutPromise, which can cause an unhandled promise rejection and potentially cancel a later/finished request unexpectedly. Refactor so the timeout handle is accessible outsidetimeoutPromiseand is alwaysclearTimeout()'d in afinallyonce the race settles (or otherwise ensure the timeout promise cannot reject after success).
const cts = new CancellationTokenSource();
const timeoutPromise = new Promise<never>((_, reject) => {
const timer = setTimeout(() => {
cts.cancel();
reject(new RpcTimeoutError(method, timeoutMs));
}, timeoutMs);
// Clear timeout if the CancellationTokenSource is disposed
cts.token.onCancellationRequested(() => clearTimeout(timer));
});
try {
return await Promise.race([connection.sendRequest<T>(method, params, cts.token), timeoutPromise]);
} finally {
cts.dispose();
}
src/test/managers/common/nativePythonFinder.configureTimeout.unit.test.ts
Show resolved
Hide resolved
…gureRetryState with full test coverage (PR #1275)
dmitrivMS
approved these changes
Feb 25, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix PET configure timeout killing the process prematurely, which prevents recovery on large workspaces.
Changes
RpcTimeoutErrorclass with amethodproperty for type-safe timeout detection instead of fragile string matchinglastConfigurationcache write to after successful response — previously a failed configure would cache the options, causing subsequent calls to skip the configure request entirelylastConfigurationon any configure failure to ensure retries are not skipped byconfigurationEqualsresolve()anddoRefresh()no longer kill the process on configure timeouts (already handled insideconfigure())configureTimeoutCounton successful configure and on process restartRoot Cause
When PET takes >30s to process configuration (e.g., large filesystem with glob patterns like
**/.venvin dev containers), the extension immediately killed it and restarted. PET then had to re-process the same configuration from scratch, hitting the same timeout again, burning through all 3 restart attempts and leaving PET permanently dead.Fixes #1274
Fixes #1266