
Expiry performance #156

Closed
Lars-Meijer opened this issue Mar 8, 2017 · 14 comments

@Lars-Meijer

I have a question about the automatic expiry of keys.

More specifically, I'm interested in what performance impact it has if I have a lot of points (say, 5 million) that all have an expire time.
Is the implementation based on Redis's implementation of EXPIRE (https://redis.io/commands/expire#how-redis-expires-keys)?
And if I access a key that has expired but has not yet been removed, is it passively removed, as is the case in Redis?
Are there any guarantees on expiry? Redis claims between 0 and 1 millisecond accuracy (https://redis.io/commands/expire#expire-accuracy).

The documentation is not really clear on this subject. I've tried looking at the code in https://github.com/tidwall/tile38/blob/master/controller/expire.go, but that didn't help me much either. (Line 44 also has a spelling error, btw; I don't think it's necessary to make a PR for that.)

@m1ome
Contributor

m1ome commented Mar 8, 2017

@Lars-Meijer You can look at the code: these lines. All of this runs every 200 ms, but we don't have any guarantee about the release time; it can be on the order of microseconds, or closer to a millisecond.

All the code does is run a "DEL" command internally, without the overhead of a server-client round trip.

@Lars-Meijer
Author

@m1ome Yeah, I saw that, but it seems to me that checking 5 million keys (as in the example) 5x a second would be a huge performance hit. But maybe I'm misinterpreting the code.

From the 100-millisecond sleep it also seems to me that this happens 10x every second (despite what the docs say), but again, I might be mistaken there.

@tidwall
Owner

tidwall commented Mar 8, 2017

More specifically I am interested in knowing what performance impact it has if I have a lot (say for example 5 million) of points that all have an expire time?

That's a good question. I don't have performance metrics around expirations, though I would expect that 5 million points, all with TTLs, shouldn't be a problem in most cases.

Is the implementation based on Redis implementation of Expire?

The Tile38 implementation is roughly based on Redis, but its performance is not as good. This is primarily because Tile38 uses a background routine that cleans up expired keys with a locking strategy that makes use of mutexes (maybe a little too much). Redis is single-threaded and doesn't need locks; it has an event-based cleanup routine which is lightweight.

And what if I access a key that is expired but has not yet been removed, are they passively removed like it is the case in Redis?

No, it's not passive. The Tile38 background routine must remove the key before it disappears from read commands like GET and NEARBY.

Are there any guarantees on expiry?

No. We would need to add passive key removal in order to have guarantees. This is because we're relying on the performance of the background cleanup routine, which could take longer depending on the number of items queued for expiration.

The routine usually executes every 100 milliseconds, and in some cases under a millisecond.

Tile38 expiration logic is stable and adequate in most scenarios, but there's a ton that can be improved upon.

Please let me know if you run into any problems, or have any suggestions.

@Lars-Meijer
Author

Maybe the background scan interval could be made configurable, as a quick "fix" in case the scan every 100 ms becomes a problem. I haven't really tested the performance of this myself either. For my use case, cleanup within 100 ms is more than adequate; I would be fine with cleanup every second or even every few seconds.

I will let you know if I run into any problems.

@tidwall
Owner

tidwall commented Mar 8, 2017

@Lars-Meijer I agree that there may be a slowdown with the frequency of expire checks times the number of items.

There are quite a few optimizations that could be done to increase performance.

Such as using an ordered structure like a btree for the queue instead of a hashmap. This would decrease performance a little on SET, but it would help tremendously in the expire routine. We could also add a simple passive check to read commands.

Some minor tradeoffs in many cases, but for datasets like yours it may help a lot.

@tidwall
Owner

tidwall commented Mar 8, 2017

@Lars-Meijer Sounds good. If you run into performance issues, please let us know right away. I'm sure we could make some tweaks to get it to work for you.

@Lars-Meijer
Author

Using an ordered data structure might speed up the process. SET is already an O(log n) operation now, right? An insert into a btree or some other ordered data structure is most likely also O(log n) and can happen in parallel, so there wouldn't be too much of a performance hit.

@tidwall
Owner

tidwall commented Mar 8, 2017

There's the collection list, which stores all the objects by id. This is a btree and is O(log n).
Then there's the spatial index, which is an R-tree, O(log_M n).
So an additional btree would slow things down a little on SETs that have TTLs.

@Lars-Meijer
Author

Thanks for the clarification. We should keep this in mind in case I run into trouble.

The list would be ordered by expiry time right?

@tidwall
Owner

tidwall commented Mar 8, 2017

Correct, ordered by expiry time.

@Lars-Meijer
Author

Wouldn't this cause problems when updating a key?
Say you have truck1 with a 60-second TTL and update it 30 seconds later, again with a TTL of 60 seconds. This would mean the old entry would have to be removed, and that would be O(n)?

@tidwall
Owner

tidwall commented Mar 8, 2017

I don't think it'll be a problem. The expiry list will only contain key/timestamp pairs, where the timestamp is the expiration as of the time the pair was added to the list. Thus we could have multiple pairs with the same key but different timestamps, if additional SETs with the same key were called before the previous TTL elapsed.

When we arrive at a pair whose expiration has elapsed, we'll delete it from the expiry list, then check it against the updated expiration belonging to the real object. If that too has expired, then we can issue a DEL command internally.

This is similar to the double check we do right now.

@Lars-Meijer
Author

Lars-Meijer commented Mar 9, 2017

Ah, that makes sense. Guess it would work then 👍

tidwall added a commit that referenced this issue Mar 29, 2017
Updated the Tile38 expires to match the Redis implementation at
https://redis.io/commands/expire#how-redis-expires-keys.

It now supports passive and active expires with sub-millisecond
accuracy.

This addresses issue #156
@tidwall
Owner

tidwall commented Mar 29, 2017

@Lars-Meijer I just pushed an update to the master branch with an updated expiration logic.

I decided to mimic Redis as closely as possible instead of introducing another btree. I think it works pretty well, and it now has passive removals with sub-millisecond accuracy.

Let me know if you run into any problems. Thanks
