-
Notifications
You must be signed in to change notification settings - Fork 3.9k
ARROW-2479: [C++] Add ThreadPool class #1953
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
65ca6ce to
614644f
Compare
|
There appears to be a failure in the symbol visibility check step: I can't reproduce here (Ubuntu 16.04). It might be related to use of the static singleton pattern (the extraneous symbol is |
pcmoritz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Plasma modifications look good to me!
|
It could be that the symbol depends on the version of the STL implementation. We can add that symbol to |
614644f to
3a54747
Compare
|
There's a crash in the manylinux job, unfortunately it seems the core dump isn't found by our Travis-CI script. |
|
I'm not sure I understand how the Docker container works. Apparently there are two copies of Arrow? One in |
|
The copy in |
|
Also, it seems |
|
That is ok, the first build of Arrow is solely for building parquet-cpp once and not per Python version. |
|
So, I have the gdb backtrace: (there are other threads) Unfortunately no line numbers... |
|
As it's in |
|
It looks worse in that case: |
|
By the way, I'm curious about this comment in symbols.map: It looks like libarrow.so still links to libstdc++ dynamically? |
|
Yes, we still link dynamically and we probably never run on systems that are older than CentOS 5. |
|
Does it mean the comment above about devtoolset and static linking is obsolete? |
yes
We hope that it works. Sadly this is not always the case. Thus my yet failed attempt to fix this: #1464 |
|
Ok, for the record, the devtoolset doesn't automatically link statically, but it has a linker script to fetch missing symbols from the statically-compiled version: And the thread-local "once_call" symbols (used by the Here's the |
|
If I look back at gdb traceback in #1953 (comment), my impression is the following:
Since frame 1 executes inside |
|
Some debug prints inside the |
|
(by the way, this is because libstdc++ implements setting a future's result using |
|
By letting the |
xhochy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM
|
The remaining question here is about the APIs that take an explicit |
|
Regarding the
|
|
It seems that by default we will start with a constant of 4 threads in the threadpool (looking into |
It uses the number returned by
Yeah, that's something we'll have to do. There's some interesting discussion here about which heuristic to use exactly: |
5d6f7bd to
cea94b4
Compare
|
Sorry I missed the ping here; I was traveling and had my GitHub notifications muted |
|
I will do a round of code review; if you want to make any changes then this can go in a new patch |
wesm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this @pitrou! I've been wanting a global thread pool for a long time but hadn't gotten around to building it myself. I am curious if the packaged task issue could be resolved by passing std::function<void()>&& task to SpawnReal instead (since task is being moved in that function anyway), hence sidestepping the requirement of being CopyConstructible
| uintptr_t block_size, int num_threads) { | ||
| std::vector<std::thread> threadpool(num_threads); | ||
| // XXX This function is really using `num_threads + 1` threads. | ||
| auto pool = CPUThreadPool(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented in the latest PR about the name for this function -- I would either call it GetCpuThreadPool or cpu_thread_pool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will do. Should I open a separate PR for the fixes here or do them as part of the latest PR?
|
|
||
| ThreadPool::~ThreadPool() { ARROW_UNUSED(Shutdown(false /* wait */)); } | ||
|
|
||
| Status ThreadPool::SetCapacity(size_t threads) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per comment on latest PR I would recommend using int threads instead of size_t threads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine :-)
| cv_.notify_all(); | ||
| cv_shutdown_.wait(lock, [this] { return workers_.empty(); }); | ||
| if (!quick_shutdown_) { | ||
| DCHECK_EQ(pending_tasks_.size(), 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this failure mode is esoteric?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be a bug, as it's checking an internal invariant.
| : task_(std::make_shared<PackagedTask>(std::forward<PackagedTask>(task))) {} | ||
|
|
||
| void operator()(Args&&... args) { return (*task_)(std::forward<Args>(args)...); } | ||
| std::shared_ptr<PackagedTask> task_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be unique_ptr?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately not, since this must be copyable.
|
|
||
| } // namespace detail | ||
|
|
||
| class ThreadPool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this class appear in the Doxygen docs? If so, should use three forward slashes /// and also add some tags like \brief to the function descriptions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. Right now I would consider the ThreadPool class an internal API, and only the top-level function to set its capacity a public API.
|
|
||
| Status st = SpawnReal(detail::packaged_task_wrapper<Result>(std::move(task))); | ||
| if (!st.ok()) { | ||
| throw std::runtime_error(st.ToString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How could this exception manifest in real code, or would it be an esoteric failure (a la DCHECK)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It manifests if you call Submit after Shutdown. It's a programmer error, though, not a random runtime condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. If you get a chance, adding a code comment there to this effect would be great
|
Feel free to incorporate any changes into the latest PR |
|
This PR seems to have introduced https://issues.apache.org/jira/browse/ARROW-2657. |
|
You have to import TensorFlow first. TensorFlow is not respecting the manylinux1 standard and is using newer compilers. We've seen bugs related to libstdc++ reported elsewhere |
|
It does look like there's an issue related to that pthread symbol though that we should see if can be fixed |
nthreads, see below)Remaining open question:
nthreadsargument? (the Pandas conversion routines, which are able to convert/copy different columns in parallel)