Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question about locking #152

Closed
cantino opened this issue Apr 30, 2015 · 8 comments
Closed

Question about locking #152

cantino opened this issue Apr 30, 2015 · 8 comments
Labels

Comments

@cantino
Copy link

cantino commented Apr 30, 2015

Hey Matthew, thanks for this great project!

We would like to use closure_tree in our product to provide multi-level task lists, but we have concerns around the locking mechanisms. Specifically, it looks like a global advisory lock would be obtained on the Task model whenever _ct_before_destroy is called. I believe this means that only one process could delete a Task at a time? Our Task table is very high volume and we can't realistically lock the full table around deletes. Are the_ct_before_destroy and rebuild! processes themselves non-multi-process-safe, or are these wrapped to protect find_or_create_by_path (which we don't need)? Since maintenance is performed by ActiveRecord callbacks, I'd expect them to be consistent because they live within the ActiveRecord transaction.

If you can help me better understand why the locking is needed, I'd be happy to write up a more detailed description for the README.

@mceachen
Copy link
Collaborator

mceachen commented May 1, 2015 via email

@cantino
Copy link
Author

cantino commented May 1, 2015

Thanks @mceachen. What I'm not clear about is, if edits are happening inside of the ActiveRecord commit transaction, why do we need a table lock at all?

@mceachen
Copy link
Collaborator

mceachen commented May 2, 2015 via email

@cantino
Copy link
Author

cantino commented May 7, 2015

Thanks for clarifying, @mceachen. I think we're going to use a materialized path approach instead since we have a write-heavy application.

@mceachen
Copy link
Collaborator

mceachen commented May 7, 2015

Just for reference, I wrote this gem precisely because materialized paths
don't scale well. That approach requires a table lock per hierarchy
change, and for random edits, you're updating half the rows in your table
per insert
. If they aren't locking, concurrent edits are not being done
safely.

It depends on your write patterns, of course. Benchmark different
approaches by replaying production traffic logs, if you can.

@cantino
Copy link
Author

cantino commented May 7, 2015

Thanks @mceachen, I really appreciate your time and insights, and am concerned that I'm missing something important for the scalability of our efforts.

We're updating the materialized paths of any sub-tasks based on the change of a task's parent_id. The update is a single SQL statement:

self.class.where("materialized_path like ?", old_path_string + "%").update_all ["materialized_path = REPLACE(materialized_path, ?, ?)", old_path_string, path_string]

If this is happening inside of a Rails ActiveRecord transaction that also contains the change to parent_id itself, why would we have concurrency issues? I could imagine deadlock issues, but I don't see how the data could end up in an inconsistent state. However, I readily admit that I may not understand transactions as well as I should.

/cc @randycoulman, @shirish-pampoorickal

@mceachen
Copy link
Collaborator

mceachen commented May 7, 2015 via email

@cantino
Copy link
Author

cantino commented May 7, 2015

Gotcha. That's a valid concern, and we will need to keep an eye on our read performance. Thanks for all your help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants