
Use priority queue in ctl::TaskQueue #379

Open
gavv opened this issue May 24, 2020 · 17 comments
Labels
algorithms (Algorithms and data structures), easy hacks (The solution is expected to be straightforward even if you are new to the project), help wanted (An important and awaited task but we have no human resources for it yet), performance


@gavv
Member

gavv commented May 24, 2020

Problem

ctl::TaskQueue is a thread-safe task queue that allows scheduling tasks for immediate or delayed execution.

Tasks for immediate execution are stored in a lock-free FIFO. Tasks for delayed execution are stored in a sorted linked list. Thus, while we have O(1) pop, the insert operation into a sorted list is O(N), which does not scale to large numbers of tasks.

The typical data structure used in such cases is a priority queue implemented using some heap variation.
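To illustrate the gap, here is a standalone sketch (not project code; it uses std containers rather than the intrusive structures discussed below): inserting into a sorted list requires a linear scan, while a heap push only sifts up.

```cpp
#include <algorithm>
#include <functional>
#include <list>
#include <vector>

// Sorted-list insert: linear scan to find the position, O(N).
void sorted_insert(std::list<int>& l, int v) {
    std::list<int>::iterator it = l.begin();
    while (it != l.end() && *it < v) {
        ++it;
    }
    l.insert(it, v);
}

// Min-heap insert: append and sift up, O(log N).
void heap_insert(std::vector<int>& h, int v) {
    h.push_back(v);
    std::push_heap(h.begin(), h.end(), std::greater<int>());
}

// Min-heap pop: O(log N); the minimum is at h.front().
int heap_pop(std::vector<int>& h) {
    std::pop_heap(h.begin(), h.end(), std::greater<int>());
    int v = h.back();
    h.pop_back();
    return v;
}
```

Both structures yield elements in order; the difference is where the cost lands (insert vs. pop), which is exactly the trade-off discussed in the comments below.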

Solution

  1. Select a priority queue algorithm meeting our requirements.
  2. Add an implementation to roc_core and cover it with unit tests.
  3. Use it instead of core::List in ctl::TaskQueue.
  4. Add benchmarks for schedule_at and reschedule_at and compare results for old and new implementations.

Requirements

Requirements to the priority queue implementation:

  • It should be intrusive, i.e. it should not perform allocations by itself, but instead should use data embedded into nodes. The same approach is used for core::List, core::MpscQueue, and core::SharedPtr. The limitation of such approach is that a node can be added only to one container, but this is okay for us.

  • It should have O(1) top/pop.

  • Ideally, it should have amortized O(1) insert, but O(log n) is also acceptable.

  • It should have O(1) membership check (check whether the element belongs to the queue). This can be implemented by just storing a pointer to the containing queue in each node, like we do for core::List.

  • Whenever possible, the implementation should be compact and readable.

  • It should NOT be thread-safe. Access to it is already serialized by ctl::TaskQueue.
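The membership-check requirement from the list above can be sketched as follows (names are hypothetical, not actual roc_core API): each node stores a back-pointer to its owning container, so `contains()` is a single comparison.

```cpp
#include <cassert>
#include <cstddef>

struct IntrusivePq;  // forward declaration of the (hypothetical) container

// Intrusive node: ownership data is embedded in the node itself, so the
// container never allocates. A node can belong to at most one queue.
struct PqNode {
    IntrusivePq* owner;  // non-null iff the node is currently in a queue
    PqNode() : owner(NULL) {}
};

struct IntrusivePq {
    // O(1) membership check: just compare the back-pointer.
    bool contains(const PqNode& n) const { return n.owner == this; }

    // A real push/pop would also update heap linkage; here we only
    // demonstrate the ownership tracking.
    void attach(PqNode& n) {
        assert(n.owner == NULL);  // a node may be in only one container
        n.owner = this;
    }
    void detach(PqNode& n) {
        assert(n.owner == this);
        n.owner = NULL;
    }
};
```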

Notes

Other TaskQueue optimizations:

@gavv added the performance and help wanted labels May 24, 2020
@gavv added the algorithms label May 25, 2020
@Ewaolx

Ewaolx commented Jul 4, 2020

Hi, it seems like all variants of the priority queue have an O(log n) delete operation (pop), and some variants have an O(1) insert operation. So, does this require an O(1) delete operation?

@gavv
Member Author

gavv commented Jul 4, 2020

@Ewaolx Indeed, thanks for noticing. According to this thread, when choosing between various heaps and trees, we'll have to pay O(log N) either for insert() or for pop(), which makes sense.

In our case, ctl::TaskQueue is used to schedule background low-priority work from real-time threads. Hence, if we should choose between faster insert() or faster pop(), insert() is preferred, since it's usually called when a task is scheduled from real-time thread, and pop() is usually called from TaskQueue background thread.

@Ewaolx

Ewaolx commented Jul 4, 2020

Sounds good. In this case, we could use something like a Fibonacci heap, which gives us O(1) insert and O(log N) pop. Also, I see that a sorted linked list (core::List) is currently used in ctl::TaskQueue, so that would be replaced by the heap, and consequently we'd have heap insert and delete instead of the linked list's, right?

@gavv
Member Author

gavv commented Jul 5, 2020

In this case, we could use something like a Fibonacci heap, which gives us O(1) insert and O(log N) pop.

Looks like a good option! Are you going to work on this?

Besides theoretical big-O complexity, we should also ensure that actual task insertion performance won't suffer in the common case, where there are not many pending tasks. So we'll need a small benchmark for TaskQueue scheduling.

A quick look at the Fibonacci heap algorithm suggests that it should be okay.

Also, I see that a sorted linked list (core::List) is currently used in ctl::TaskQueue, so that would be replaced by the heap, and consequently we'd have heap insert and delete instead of the linked list's, right?

Yes. IIRC, we'll need to replace three methods: fetch_sleeping_task_, insert_sleeping_task_, remove_sleeping_task_.

@Ewaolx

Ewaolx commented Jul 5, 2020

Okay, got it. Thanks for the update.

Yes, I can start working on this. I see roc_ctl is in the develop branch, not in master. Are the guidelines for building and running the same for the develop branch?

@gavv
Member Author

gavv commented Jul 5, 2020

Great! The develop branch is quite ahead of master and the guidelines have changed. They're not published on the web site (it's built from master), but you can find them in git:

@MBkkt

MBkkt commented Jul 19, 2020

something like Fibonnaci heap

I don't recommend using this algorithm; it's good in theory, but in practice it has a huge constant factor. I advise you to try a regular binary heap, and only if the performance turns out to be insufficient should you look for optimizations.

@gavv
Member Author

gavv commented Jul 19, 2020

@MBkkt thanks for the notice!

I haven't worked much with heaps myself, but quick googling and even the Wikipedia article on the Fibonacci heap confirm your point.

I feel that a binary heap isn't a great fit for us because it has O(log n) insertion, while in our specific case insertion speed is the most critical, since tasks are inserted from latency-sensitive threads.

Wikipedia and this page also suggest the Pairing heap as a better (faster in practice) alternative to the Fibonacci heap. Actually, even theoretically it seems better for our case, because Pairing heap insertion is O(1) even in the worst case, while Fibonacci heap insertion is only amortized O(1). The implementation of the Pairing heap also looks simpler, arguably.
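For reference, the core pairing heap operations can be sketched as follows (a minimal illustration, not the proposed roc_core implementation; a real intrusive version would also keep a parent/previous link so arbitrary nodes can be removed for rescheduling):

```cpp
#include <cstdint>

// Min-ordered pairing heap node, keyed by deadline.
struct PhNode {
    uint64_t deadline;
    PhNode* child;    // leftmost child
    PhNode* sibling;  // next sibling in the child list
    explicit PhNode(uint64_t d) : deadline(d), child(0), sibling(0) {}
};

// Merge two heaps: the larger root becomes the leftmost child of the
// smaller one. O(1) worst case -- this is why insertion is O(1).
PhNode* ph_merge(PhNode* a, PhNode* b) {
    if (!a) return b;
    if (!b) return a;
    if (b->deadline < a->deadline) {
        PhNode* t = a;
        a = b;
        b = t;
    }
    b->sibling = a->child;
    a->child = b;
    return a;
}

// Insert is just a merge with a single-node heap: O(1) worst case.
PhNode* ph_insert(PhNode* root, PhNode* n) {
    return ph_merge(root, n);
}

// Two-pass pairing of a sibling list: pair left-to-right, then merge the
// pairs right-to-left (via the recursion). Gives O(log n) amortized pop.
PhNode* ph_merge_pairs(PhNode* list) {
    if (!list || !list->sibling) return list;
    PhNode* a = list;
    PhNode* b = a->sibling;
    PhNode* rest = b->sibling;
    a->sibling = 0;
    b->sibling = 0;
    return ph_merge(ph_merge(a, b), ph_merge_pairs(rest));
}

// Remove the minimum (the root) and rebuild the heap from its children.
PhNode* ph_pop(PhNode* root) {
    PhNode* children = root->child;
    root->child = 0;
    return ph_merge_pairs(children);
}
```

Note that the structure is naturally intrusive: all linkage lives in the node, and no operation allocates, which matches the requirements listed in the issue.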

@Ewaolx what do you think? How far did you get on the Fibonacci heap implementation?

@MBkkt

MBkkt commented Jul 20, 2020

I understand the concern, but what is the order of the number n? Are you sure the insert will be the bottleneck? In general, given that you want an intrusive queue, probably a binary heap is really not suitable.

@Ewaolx

Ewaolx commented Jul 21, 2020

@gavv Sorry for the late reply, but as you mentioned, I also think that if we want strictly O(1) insertion, then a binary heap would not work. Also, the pairing heap sounds like a good option if we are concerned about the Fibonacci heap's practical performance. We can go ahead with the pairing heap in this case.

Let me know what you think. Thanks.

@gavv
Member Author

gavv commented Aug 6, 2020

I understand the concern, but what is the order of the number n? Are you sure the insert will be the bottleneck? In general, given that you want an intrusive queue, probably a binary heap is really not suitable.

Usually the number of tasks should be low. However, the key point here is that it can grow uncontrollably during load peaks, and at the same time we don't want latency to grow uncontrollably.

We don't optimize every single piece of code, but try to follow design principles which would keep us from ending up with code that we're unable to optimize at all.

One of the principles chosen in this project is that delays in one thread (whatever their reason) should not cause delays in other threads. Such a design decouples threads in terms of latency and delays, so when it comes to performance problems, you can review threads in isolation, which is much simpler.

TaskQueue is used for communication between pipeline and control threads. If it's not O(1), then delays in the control thread will lead to accumulating tasks, thus causing longer insertion times, thus probably causing delays in pipeline threads. And probably not, who knows. You'd have to profile it each time you're debugging glitches.

So it's very attractive to just use an O(1) data structure and be 100% sure that TaskQueue insertion always takes fixed time and the control thread is guaranteed not to affect the pipeline thread. Then, if you hear glitches, you can just profile the pipeline thread in isolation.

As for the specific numbers, as mentioned in the issue, it would be nice to add benchmarks and measure insertion times on different queue sizes.
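As a rough illustration of how such a measurement could look (a standalone std::chrono sketch, not the project's benchmark harness; it times the sorted-list scan in isolation rather than schedule_at itself):

```cpp
#include <chrono>
#include <cstddef>
#include <list>

// Time a single insert into a sorted list that already holds n elements.
// The new value lands in the middle, so the scan covers roughly n/2 nodes.
long long sorted_insert_ns(std::size_t n) {
    std::list<long> l;
    for (std::size_t i = 0; i < n; i++) {
        l.push_back(static_cast<long>(i) * 2);  // even numbers, sorted
    }
    const long v = static_cast<long>(n);  // odd value, falls in the middle

    const std::chrono::steady_clock::time_point t0 =
        std::chrono::steady_clock::now();
    std::list<long>::iterator it = l.begin();
    while (it != l.end() && *it < v) {
        ++it;  // the O(N) scan being measured
    }
    l.insert(it, v);
    const std::chrono::steady_clock::time_point t1 =
        std::chrono::steady_clock::now();

    return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0)
        .count();
}
```

Running this for several queue sizes (say, 1K, 10K, 100K) and repeating the same measurement against the heap-based implementation would show how insertion time grows with the number of pending tasks.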

@gavv
Member Author

gavv commented Aug 6, 2020

@gavv Sorry for the late reply, but as you mentioned, I also think that if we want strictly O(1) insertion, then a binary heap would not work. Also, the pairing heap sounds like a good option if we are concerned about the Fibonacci heap's practical performance. We can go ahead with the pairing heap in this case.

Let me know what you think. Thanks.

Yeah, let's try a pairing heap then.

@ashishsiyag

Hey @Ewaolx, are you working on the pairing heap implementation?

@Hassan-A

Hello @gavv, this seems inactive. Mind if I try as well?

@gavv
Member Author

gavv commented Sep 23, 2020

Mind if I try as well?

Hi, sure.

@divyavit

divyavit commented Oct 1, 2020

Hi, I was interested in working on this and wanted to give it a try. Has it already been done?

@Hassan-A

Hassan-A commented Oct 2, 2020

Hi @divyavit , still in progress.

@gavv unassigned Ewaolx Mar 7, 2021
@gavv added the easy hacks label Sep 22, 2023
@gavv added this to Roc Toolkit Jul 6, 2024
@gavv moved this to Help wanted in Roc Toolkit Jul 6, 2024