Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create delete function in manager to free / destroy sequence #36

Closed
axsaucedo opened this issue Sep 3, 2020 · 6 comments · Fixed by #146
Closed

Create delete function in manager to free / destroy sequence #36

axsaucedo opened this issue Sep 3, 2020 · 6 comments · Fixed by #146
Labels
enhancement New feature or request

Comments

@axsaucedo
Copy link
Member

axsaucedo commented Sep 3, 2020

Currently there is only way to create a new sequence but there is no way to destroy the sequence inside the manager

See #113 for the discussion providing further context on the exploration related to this issue. For compleeteness, the conversation is added below:

@alexander-g : @axsaucedo Bump, this is quite important.
If you are unsure about asynchronous operations, it would already help to make those methods private and to only clean up anonymous sequences in synchronous ops, i.e. in evalOpDefault directly after the sequence is evaluated.

@axsaucedo: @alexander-g hmm that is actually a pretty good idea... ok I agree this is quite important, thank you @aliPMPAINT for the initial work, I will pull the branch, and do some deeper testing, as well as expose via python interface to make sure we can merge. Thanks both for driving forward this areas.

@axsaucedo: @alexander-g currently looking at this, I now remember why we don't delete the function right after the sequence is executed. The reason why this is by design is because the sequence actually acquires the GPU memory ownership of the Tensors when running the OpCreateTensors. More specifically, it is currently in the memory hierarchy for the OpBase to have a choice whether to free the Tensors it owns or not - here is the line that shows how OpCreateTensor owns the tensors it uses by setting the last value to true [... code example ...] This is actually tied to the discussion I provided in your #130 PR (#130 (comment)) where I basically mention that there may be a better way to think about the OpTensorCreate operation, and the hierarchy of the Tensor memory management overall. This is something that would require a deeper dive, primarily as I need to check if it can make sense for the memory ownership to no longer be Manager -> Sequence -> Op -> Tensor, and update it to remove Tensor as a dependency of the operation. This could require a less trivial refactor, as it's not clear what that hierarchy would be otherwise. This is teh reason why anonymous Sequences are actually kept. This could actually be change such that anonymous sequences are destroyed after execution, which is what we had at the very beginning but as you can imagine this led to tensor memory being destroyed right after a OpTensorCreate in an anonymous function. But this woudl require specific awareness of users, which would have to always run the OpTensorCreate in a non-anonymous function, and then the rest of the operations in an anonymous function.

@axsaucedo: For completeness it may be worth sharing another piece of insight to provide the full picture. Namely that when I was initially desigining the tensors, i had an idea where instead of having two tensors - a Staging and a Device tensor - all the logic would be contained as a single Tensor. Namely this single tensor would contain both relevant memories for the staging and host tensors. The reason why this was not pursued at the end is because the idea was to provide further access and granularity into how the memory is made available to the user - this way the user can actually decide to build their own OpTensor operations, which may own different tensors, sometimes perhaps destroying the staging tensor completely to avoid having the memory overhead. With this in mind, it could be revisited, but there is that disadvantage that for any device tensor, there would be always the overhead of an extra memory component that would be obscured by the user. The advantage of this is that in this case the staging tensor wouldnt' be created directly by the OpCreate operation, which means that the tensors could be owned by the top level manager. For all of these there are various tradeoffs that could be reassessed. For now, the simplest way to approach this is to enable a flag that alllows for deletion of anonymous sequences as soon as they are executed, which can be used by more advanced users that would know that using the OpCreateTensor would involve memory management, and hence should be executed in a non-anonymous function that should be managed manually (and then deleted explicitly with the deleteNamedSequence.

@axsaucedo axsaucedo added the enhancement New feature or request label Sep 3, 2020
@unexploredtest
Copy link
Contributor

Sorry if this is a silly question but doesn't void freeMemoryDestroyGPUResources(); already do this?

@axsaucedo
Copy link
Member Author

@aliPMPAINT that is partially correct, so the freeMemoryDestroyGPUResources function would do most of the work here, but this task would basically be to create a funciton in the kp::Manager to be able to delete managed sequences that have been created. So basically at this point it's possible to create anonymous and named sequences, but there is no way to delete them. For the named sequences this would be quite easy as it would be a simple function that just removes the seuqences with a particular name, but for anonymous sequences it would be slightly harder as there would need to be an assumption on what the most optimal way of "garbage collecting" previous sequences is.

@unexploredtest
Copy link
Contributor

Yeah I see. Could you also clarify another thing?
So, if there were several anonymous sequences, how can we decide which one we're trying to delete?

@axsaucedo
Copy link
Member Author

So that would fall under the second category mentioned - the ones that would be "garbage collected", as these are anonymously created.

Currently there's no reasonable way to remove them, so there could be various ways in which this could be dealt with.

Initially the easiest thing is to just provide a way to remove named sequences, and provide a function to just clear all anonymous (and optionally also named) sequences.

Later on, there could be a function that returns all the anonymous sequence names, so the user can go through them in any order as required (although I haven't seen so far the need for this).

@unexploredtest
Copy link
Contributor

unexploredtest commented Jan 15, 2021

@axsaucedo Oh I see, thanks.

Initially the easiest thing is to just provide a way to remove named sequences, and provide a function to just clear all anonymous (and optionally also named) sequences.

I don't have much experience with C++ yet, but I'll try implementing these feature.

@axsaucedo
Copy link
Member Author

I've updated the issue with the discussion on #113

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants