
calling .exit() has unexpected result: wllamaExit is not a function #121

Open · flatsiedatsie opened this issue Sep 29, 2024 · 6 comments
flatsiedatsie commented Sep 29, 2024

[Screenshot 2024-09-29 at 14:29:21: the "wllamaExit is not a function" error]

In the code below I verify that the exit function exists, but calling it still results in the error above.

try{
	if(window.llama_cpp_model_being_loaded){
		if(typeof window.llama_cpp_app.unloadModel === 'function'){
			await window.llama_cpp_app.unloadModel();
		}else{
			console.error("window.llama_cpp_app was not null, but had no unloadModel function?  window.llama_cpp_app: ", window.llama_cpp_app);
		}
	}
	else{
		if(typeof window.llama_cpp_app.exit === 'function'){
			await window.llama_cpp_app.exit();
		}else{
			console.error("window.llama_cpp_app was not null, but had no exit function?  window.llama_cpp_app: ", window.llama_cpp_app);
		}
	}
}
catch(err){
	console.error("caught error trying to stop/unload Wllama: ", err);
}
flatsiedatsie commented Sep 29, 2024

I just noticed something while doing mobile debugging. I couldn't figure out why the smallest model (Danube 3 500m, 320MB) wasn't working. According to my debug information the Wllama object was null (which it normally gets set to after exit() is called successfully).

On mobile the memory debugging was working, and there I noticed that Wllama's multi-thread workers still seemed to exist?

[Screenshot 2024-09-29 at 14:58:36: Wllama's worker threads still present]

Maybe I should never set the Wllama object back to null.

Is there another sure-fire way to fully destroy Wllama? The UI allows users to load all kinds of models, some of which are handled by Wllama, while others are handled by WebLLM or even Transformers.js. I try to juggle these 'runners' memory-wise, only allowing one of them to exist at a time.

At least, that was the theory...
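
To make that concrete, the juggling logic is roughly this shape. This is a simplified sketch: the WebLLM teardown call and the window.web_llm_engine name are placeholders for whatever that runner actually exposes, not real API names.

async function switch_runner(load_new_runner){
	// Tear down whichever runner is currently alive before starting the next one.
	if(window.llama_cpp_app){
		await window.llama_cpp_app.exit(); // Wllama teardown
		window.llama_cpp_app = null;
	}
	if(window.web_llm_engine){
		await window.web_llm_engine.unload(); // placeholder: WebLLM teardown
		window.web_llm_engine = null;
	}
	// ...then initialize the new runner (Wllama, WebLLM or Transformers.js).
	await load_new_runner();
}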

ngxson commented Sep 30, 2024

> Maybe I should never set the Wllama object back to null.

You can set it to a new wllama instance instead of setting it to null.

flatsiedatsie commented

> You can set it to a new wllama instance instead of setting it to null.

Thanks. Will that kill the workers and unload the model to release the memory properly?

I just noticed that resetWllamaInstance effectively does what you describe.

const resetWllamaInstance = () => {
  wllamaInstance = new Wllama(WLLAMA_CONFIG_PATHS, { logger: DebugLogger });
};

For now I've modified the code to no longer null the Wllama instance, and just unload the model when it's WebLLM's turn. Or could that result in memory not being fully recovered?
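
Concretely, the modified flow keeps the instance alive and only drops the weights, along these lines (a sketch of my own code, not a recommended recipe):

// Keep the Wllama instance; only release the loaded model.
if(typeof window.llama_cpp_app.isModelLoaded === 'function' && await window.llama_cpp_app.isModelLoaded()){
	await window.llama_cpp_app.unloadModel();
}
// window.llama_cpp_app is intentionally NOT set to null here.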

flatsiedatsie commented Oct 1, 2024

I'm running into a situation where await wllama.exit() is stuck. The code (similar to that in the first post) doesn't get beyond it.

[Screenshot 2024-10-01 at 15:53:31: execution stuck on await wllama.exit()]

I'm trying to unload the old model before loading a new one.

if(typeof window.llama_cpp_app.isModelLoaded != 'undefined'){
	let a_model_is_loaded = await window.llama_cpp_app.isModelLoaded();
	console.warn("WLLAMA: need to unload a model first?: ", a_model_is_loaded, window.llama_cpp_app);
	if(a_model_is_loaded && typeof window.llama_cpp_app.unloadModel != 'undefined'){
		console.log("wllama: unloading loaded model first.  window.llama_cpp_app: ", window.llama_cpp_app);
		await window.llama_cpp_app.unloadModel();
	}
	else if(a_model_is_loaded && typeof window.llama_cpp_app.exit != 'undefined'){
		console.error("wllama: unloading loaded model first by calling exit instead of unloadModel.  window.llama_cpp_app: ", window.llama_cpp_app);
		await window.llama_cpp_app.exit();
		console.log("wllama exited.  window.llama_cpp_app is now: ", window.llama_cpp_app);
	}
	else if(a_model_is_loaded){
		console.error("WLLAMA HAS A MODEL LOADED, BUT NO WAY TO UNLOAD IT?  window.llama_cpp_app: ", window.llama_cpp_app);
		return false;
	}
	create_wllama_object(); // TODO: potential memory leak if the old model isn't unloaded properly first
}
else{
	console.error("llama_cpp_add has no isModelLoaded: ", window.llama_cpp_app);
}

The log shows "wllama: unloading loaded model first by calling exit instead of unloadModel", but I never see "wllama exited".
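
As a workaround I'm considering racing exit() against a timeout, so the flow can at least continue to manual cleanup instead of hanging forever. Untested sketch:

// Give exit() a few seconds; if it never resolves, fall through to manual cleanup.
const exit_timeout = new Promise((resolve) => setTimeout(() => resolve('timeout'), 5000));
const result = await Promise.race([window.llama_cpp_app.exit(), exit_timeout]);
if(result === 'timeout'){
	console.error("wllama exit() timed out, continuing with manual cleanup");
}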

flatsiedatsie commented

The reason I ask is that I've read Mobile Safari doesn't clean up orphaned web workers properly.

I'm now attempting this:

if(typeof window.llama_cpp_app.proxy != 'undefined' && window.llama_cpp_app.proxy != null && typeof window.llama_cpp_app.proxy.worker != 'undefined'){
	console.warn("wllama.proxy still existed, attempting to terminate it manually");
	window.llama_cpp_app.proxy.worker.terminate();
}

flatsiedatsie commented

Calling window.llama_cpp_app.proxy.worker.terminate(); has been working well for now.

I'll leave this issue open because I'm curious what the recommended route for unloading models is, and how memory can be optimally recovered while keeping an instance of Wllama alive for housekeeping tasks.
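
For reference, the combined teardown I've ended up with looks roughly like this. create_wllama_object is my own helper that constructs a fresh Wllama instance; treat the whole thing as a sketch rather than a recommended recipe:

async function teardown_wllama(){
	try{
		// Best effort: ask Wllama to exit, but don't wait on it forever.
		await Promise.race([
			window.llama_cpp_app.exit(),
			new Promise((resolve) => setTimeout(resolve, 5000)),
		]);
	}
	catch(err){
		console.error("caught error during wllama exit: ", err);
	}
	// Belt and braces: terminate the proxy worker if it still exists.
	if(window.llama_cpp_app.proxy && window.llama_cpp_app.proxy.worker){
		window.llama_cpp_app.proxy.worker.terminate();
	}
	// Replace the instance instead of nulling it, as suggested above.
	create_wllama_object();
}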
