Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multithreading via C++ thread pool of clients #125

Merged
merged 95 commits into from
Jun 5, 2024

Conversation

kentslaney
Copy link
Contributor

I've implemented the C++ thread workers to be FILO under the assumption that reusing a single client is preferable when possible. The current implementation needs C++17 or above to parallelize some of the client setup (std::execution). Building still fails with the required version, but I figured I'd open a draft PR anyway in case there's feedback. I'd especially appreciate advice about a good testing setup for the concurrent code along with any opinions about language bindings.

@lexdene
Copy link
Contributor

lexdene commented Dec 14, 2023

World you please show a benchmark about the performance improvement your modifications have?
If there is no obvious improvement, I'd like to do the multithreading stuffs in Python code rather than inside the code of libmc.

@kentslaney kentslaney marked this pull request as ready for review February 7, 2024 04:16
@kentslaney
Copy link
Contributor Author

kentslaney commented Feb 7, 2024

Alright, I think this is ready to be reviewed/merged. I also rewrote some parts of the README unrelated to this PR for the sake of clarity.

I'll start tracking gevent support for threaded clients in a greenify PR. Ideally that shouldn't affect the implementation here.

Edit: actually, one more thing; I meant to look into golang bindings

@kentslaney
Copy link
Contributor Author

kentslaney commented Feb 8, 2024

It looks like gomemcache is already thread safe, so I'm back to thinking this PR is done.

@kentslaney
Copy link
Contributor Author

kentslaney commented May 31, 2024

I've closed the greenify PR for being unsolvable. Unless you have feedback, this pull request is done. Please squash & merge.

Thanks for your time, I know this is a lot of code to add at one time. The reason that I made the fork is already satisfied by having it under my profile, so the pull request really is up to you.

@lexdene
Copy link
Contributor

lexdene commented Jun 3, 2024

I think the changes under .github/workflows directory are not related.
Would you please revert all changes under this directory?

I do not agree with some changes of readme.
Additionally, I do not want to discuss much about readme here which I think is not related.
Would you please revert all changes in readme and submit another pull request which only contains changes of it so that we can discuss specifically and pointedly.

What is misc/aliases used for?
I have not found any code referencing this file.
If it is possible to delete misc/aliases,
misc/git/debug.patch and misc/git/pre-commit which are only referenced by misc/aliases can also be deleted.

Changes about greenify are also not related I think.

Sorry for replying so slowly, but it is really not easy to look over such a big pull request.
I will be happier if it is splitted into several small ones.

@kentslaney
Copy link
Contributor Author

I think I've now removed all the parts of the pull request that aren't necessary for the ThreadedClient interface in Python.

Quick summary of the parts removed:

  • .github/workflows/golang.yml and .github/workflows/python.yml: changes for nektos/act
  • .github/workflows/manual.yml: shorter, manual tests for faster iteration
  • .gitignore: cygdb (cython debugger) uses /cython_debug
  • Makefile: make clean
  • README.rst: discussed above
  • include/Common.h, src/Common.cpp, and src/Connection.cpp: better naming for some of the things in Add support for UNIX domain sockets #120
  • misc/aliases: executable with shortcuts for commands I regularly used during development
  • misc/git/*: created git hooks that automatically un-staged the setup.py changes I needed for debugging Cython
  • misc/memcached_server: logs all port traffic for debugging reasons
  • misc/runbench.py: raises an exception for tests with incorrect output instead of potentially clogging stdout
  • setup.py: actually I don't remember why this one was there, but it removes some process cleanup steps apparently

I'm currently expecting to make separate PRs to add back the changes in:

  • README.rst
  • include/Common.h, src/Common.cpp, and src/Connection.cpp

Here's some ideas on parts that can further split up the changes, along with what I see as the benefits and reasons not to.

  • benchmarking changes needed for threading
    • pros
      • allows the functional parts to be merged with less code to review
    • cons
      • mostly just moves the changes in runbench.py to a different PR
      • removes an integration test for the threaded code
  • separate the C threading interface from the python one
    • pros
      • fairly even split, significant reduction in largest PR size
    • cons
      • they're already mostly organized by file type
      • wouldn't clarify the changes in either
      • the python interface, most likely to be user facing, has the same critical path

Please let me know if there's any changes that would make reviewing easier.

std::atomic<int> rv = 0;
std::lock_guard<std::mutex> updating(m_fifo_access);
std::for_each(irange(0), irange(m_clients.size()),
//std::for_each(std::execution::par_unseq, irange(0), irange(m_clients.size()),
Copy link
Contributor Author

@kentslaney kentslaney Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loops for initializing and growing the client pool (doubles in size each time it's overrun) were written using for_each in order to support parallelized initialization of the client objects, but, at the scales I tested it at, the parallelization didn't make a difference, thus the commented out execution policy.

self.config(libmc.MC_MAX_CLIENTS, POOL_SIZE)


class FIFOThreadPool(ThreadPool):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is analogous to the C++ LockPool class and might be an easier way to understand the overall control flow

@lexdene lexdene merged commit fb03c5d into douban:master Jun 5, 2024
28 checks passed
@kentslaney
Copy link
Contributor Author

🎉 🎉 🎉

kentslaney added a commit to kentslaney/libmc that referenced this pull request Jun 6, 2024
lexdene pushed a commit that referenced this pull request Jul 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants