
Expiry performance #156

Closed
Lars-Meijer opened this issue Mar 8, 2017 · 14 comments

@Lars-Meijer

I have a question about the automatic expiry of keys.

More specifically, I'm interested in what performance impact it has if I have a lot of points (say, 5 million) that all have an expire time.
Is the implementation based on Redis's implementation of EXPIRE (https://redis.io/commands/expire#how-redis-expires-keys)?
And if I access a key that has expired but has not yet been removed, is it passively removed, as is the case in Redis?
Are there any guarantees on expiry? Redis claims between 0 and 1 millisecond accuracy (https://redis.io/commands/expire#expire-accuracy).

The documentation is not really clear on this subject. I've tried looking at the code in https://github.com/tidwall/tile38/blob/master/controller/expire.go, but that didn't help me much either. (Line 44 also has a spelling error, btw; I don't think it's necessary to make a PR for that.)

@m1ome
Contributor

m1ome commented Mar 8, 2017

@Lars-Meijer You can look at the code: these lines. All of this runs every 200 ms, but we don't have any guarantee about the release time; it can be on the order of microseconds, or closer to a millisecond.

All the code does is run a "DEL" command internally, without the overhead of a server-client round trip.

@Lars-Meijer
Author

@m1ome Yeah, I saw that, but it seems to me that checking 5 million keys (as in the example) 5x a second would be a huge performance hit. But maybe I'm misinterpreting the code.

From the 100-millisecond sleep it also seems to me that this happens 10x every second (despite what the docs say), but again, I might be mistaken there.

@tidwall
Owner

tidwall commented Mar 8, 2017

More specifically I am interested in knowing what performance impact it has if I have a lot (say for example 5 million) of points that all have an expire time?

That's a good question. I don't have performance metrics around expirations, though I would expect that 5 million points, all with TTLs, shouldn't be a problem in most cases.

Is the implementation based on Redis implementation of Expire?

The Tile38 implementation is roughly based on Redis, but its performance is not as good. This is primarily because Tile38 uses a background routine that cleans up expired keys with a locking strategy that makes use of mutexes (maybe a little too much). Redis is single-threaded and doesn't need locks; it has an event-based cleanup routine which is lightweight.

And what if I access a key that is expired but has not yet been removed, are they passively removed like it is the case in Redis?

No, it's not passive. The Tile38 background routine must remove the key before it disappears from read commands like GET and NEARBY.

Are there any guarantees on expiry?

No. We would need to add passive key removal in order to have guarantees. This is because we're relying on the performance of the background cleanup routine, which could take longer depending on the number of items queued for expiration.

The routine usually executes every 100 milliseconds, and in some cases under a millisecond.

Tile38 expiration logic is stable and adequate in most scenarios, but there's a ton that can be improved upon.

Please let me know if you run into any problems, or have any suggestions.

@Lars-Meijer
Author

Maybe the background scan interval could be made configurable, as a quick "fix" in case the scan every 100 ms becomes a problem. I haven't really tested the performance of this myself either. For my use case, cleanup within 100 ms is more than adequate; I would be fine with cleanup every second or even every few seconds.

I will let you know if I run into any problems.

@tidwall
Owner

tidwall commented Mar 8, 2017

@Lars-Meijer I agree that there may be a slowdown with the frequency of expire checks times the number of items.

There are quite a few optimizations that could be done to increase performance.

Such as using an ordered structure like a btree for the queue instead of a hashmap. This would decrease performance a little on SET, but it would help tremendously in the expire routine. We could also add a simple passive check to read commands.

Some minor tradeoffs in many cases, but for datasets like yours it may help a lot.

@tidwall
Owner

tidwall commented Mar 8, 2017

@Lars-Meijer Sounds good. If you run into performance issues, please let us know right away. I'm sure we could make some tweaks to get it to work for you.

@Lars-Meijer
Author

Using an ordered data structure might speed up the process. SET is already an O(log n) operation now, right? An insert into a btree or some other ordered data structure is most likely also O(log n) and can happen in parallel, so there wouldn't be too much of a performance hit.

@tidwall
Owner

tidwall commented Mar 8, 2017

There's the collection list, which stores all the objects by id. This is a btree and is O(log n).
Then there's the spatial index, which is an R-tree, O(log_M n).
So an additional btree would slow things down a little on SETs that have TTLs.

@Lars-Meijer
Author

Thanks for the clarification. We should keep this in mind in case I run into trouble.

The list would be ordered by expiry time right?

@tidwall
Owner

tidwall commented Mar 8, 2017

Correct, ordered by expiry time.

@Lars-Meijer
Author

Wouldn't this cause problems when updating a key?
Say you have truck1 with a 60-second TTL and update it 30 seconds later, again with a TTL of 60 seconds. This would mean the old entry would have to be removed, and that would be O(n)?

@tidwall
Owner

tidwall commented Mar 8, 2017

I don't think it'll be a problem. The expiry list will only contain key/timestamp pairs, where the timestamp is the expiration as of the time the pair was added to the list. Thus we could have multiple pairs with the same key but different timestamps, if additional SETs with the same key were called before the previous TTL elapsed.

When we arrive at a pair whose expiration has elapsed, we'll delete it from the expiry list, then check it against the updated expiration belonging to the real object. If that too has expired, then we can issue a DEL command internally.

This is similar to the double check we do right now.

@Lars-Meijer
Author

Lars-Meijer commented Mar 9, 2017

Ah, that makes sense. Guess it would work then 👍

tidwall added a commit that referenced this issue Mar 29, 2017
Updated the Tile38 expires to match the Redis implementation at
https://redis.io/commands/expire#how-redis-expires-keys.

It now supports passive and active expires with sub-millisecond
accuracy.

This addresses issue #156
@tidwall
Owner

tidwall commented Mar 29, 2017

@Lars-Meijer I just pushed an update to the master branch with an updated expiration logic.

I decided to mimic Redis as closely as possible instead of introducing another btree. I think it works pretty well, and it now has passive removals with sub-millisecond accuracy.

Let me know if you run into any problems. Thanks
