kv store support and clients #155

Open
minrk opened this issue Feb 21, 2023 · 10 comments

@minrk
Member

minrk commented Feb 21, 2023

We're in a bit of a weird situation with Key-Value (KV) store support: there don't appear to be any maintained Python clients for etcd or consul. Traefik supports several KV stores, and we happened to pick etcd and consul, not for any hugely specific reason, but because they are single binaries, which makes them easy to install.

We've been using https://github.com/kragniz/python-etcd3, which is mostly unmaintained, and a breakage in grpcio prompted a few "works for me" forks, which may or may not take over, or end up abandoned, too. https://opendev.org/openstack/etcd3gw appears to be maintained, but doesn't seem to be meant for outside use: it has almost no documentation and no public-facing bug reporting or contribution process, and roughly the only thing in its docs, a pip install command, has the wrong package name. python-consul2 also appears to be abandoned, with no real candidate for an alternative.

In general, grpcio/protobuf doesn't seem to be a good stack for Python clients, which I think would be better served by far simpler, more stable HTTP APIs.

I don't think we really care what's used, and the Python redis API situation is far healthier than etcd or consul. All we really care about is being able to support multiple traefik replicas in z2jh.

I think bootstrapping the KV store is far less important than bootstrapping traefik itself. In any situation where a KV backend is used, the KV store is almost guaranteed to be run separately in a container (there's no real reason to use KV on a single machine like littlest-jupyterhub, where files work just fine). So there's ~no situation where I imagine install.py bootstrapping a KV store to be useful in practice, and it's certainly not worth the relatively high maintenance cost of keeping install.py updated, compared with the small cost of end users installing a single binary of their choice.

So the question is:

  1. what KV stores do we support?
  2. what tools do we support installing ourselves (just traefik, or traefik, etcd, consul, etc.)?

I currently think we should:

  • remove etcd and consul from install.py, leave it just for traefik
  • maybe deprecate consul support altogether (don't delete it because it works, but don't put more effort into maintaining it)
  • consider adding redis, as the far healthier option on the Python side
  • consider rewriting etcd support to use HTTP instead of any etcd3 Python client. Our uses are so minimal that this may be the simplest approach
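
To give a rough idea of what the HTTP route in the last bullet could look like, here is a minimal sketch using only the standard library. It assumes etcd's JSON gateway as exposed under /v3/kv/ in etcd 3.4 and later (older releases used a different path prefix), where keys and values are base64-encoded in the request body. The helper names and the ETCD_URL value are made up for illustration and are not taken from traefik-proxy.

```python
import base64
import json
import urllib.request

# Assumption: a local etcd listening on the default client port.
ETCD_URL = "http://127.0.0.1:2379"


def _b64(text: str) -> str:
    """etcd's JSON gateway expects keys and values as base64 strings."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def _prefix_range_end_b64(prefix: str) -> str:
    """Base64 of the smallest key greater than every key starting with prefix."""
    raw = bytearray(prefix.encode("utf-8"))
    for i in reversed(range(len(raw))):
        if raw[i] < 0xFF:
            raw[i] += 1
            return base64.b64encode(bytes(raw[: i + 1])).decode("ascii")
    # Empty prefix (or all 0xff bytes): "\0" means "to the end of the keyspace".
    return base64.b64encode(b"\0").decode("ascii")


def put_payload(key: str, value: str) -> dict:
    """Body for POST /v3/kv/put."""
    return {"key": _b64(key), "value": _b64(value)}


def get_prefix_payload(prefix: str) -> dict:
    """Body for POST /v3/kv/range, selecting every key under prefix."""
    return {"key": _b64(prefix), "range_end": _prefix_range_end_b64(prefix)}


def delete_prefix_payload(prefix: str) -> dict:
    """Body for POST /v3/kv/deleterange, removing every key under prefix."""
    return get_prefix_payload(prefix)


def call(path: str, payload: dict, base_url: str = ETCD_URL) -> dict:
    """Send one JSON request to the gateway and decode the JSON reply."""
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a running etcd, something like call("/v3/kv/put", put_payload("some/key", "some value")) would store one key. Authentication and TLS are left out of this sketch.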
@manics
Member

manics commented Feb 21, 2023

I think supporting redis as the (default?) distributed KV store makes sense. It's widely used and understood, the Python module is maintained by redis, it's easily runnable as a container/helm chart as well as a fully managed cloud service, and if you did want to install it on a VM it's most likely in your Linux distribution's repository.

I don't think we need to support installing redis/etcd/consul in the installer since there's a file backend https://github.com/jupyterhub/traefik-proxy/blob/main/jupyterhub_traefik_proxy/fileprovider.py

This would be similar to how JupyterHub supports multiple databases like PostgreSQL and MySQL, but only sqlite is supported out of the box, and we don't include other databases as part of the installation process.

@GeorgianaElena
Member

I'm +1 on keeping only traefik in the installer.

As for which KV stores to support, I'd also advocate adding redis from a maintainability point of view, and deprecating both consul and etcd.

consider rewriting etcd to use HTTP instead of any etcd3 Python client. Our uses are so minimal, that this may be the simplest approach

I believe it makes sense to track this in a separate issue that can be implemented if and when the need arises.

@manics
Member

manics commented Feb 21, 2023

Does anyone remember the background decisions that led to the choice of consul and etcd? This would help us decide whether to keep, deprecate or drop them.

@alexleach
Contributor

Agreed, I also thought install.py was a little unnecessary, and it does add a maintenance burden, with having to change the checksums, etc. I was completely unaware that python-etcd3 and python-consul were no longer maintained, though.

@alexleach
Contributor

On a slightly related, but separate note...
I guess (because I've never deployed a Kubernetes cluster) TLJH describes how to deploy a Kubernetes cluster with jupyterhub-traefik-proxy running as a service. Personally, I use docker-compose to run jupyterhub with jupyterhub-traefik-proxy in one service and traefik in another service (actually in a completely separate docker-compose project). I don't bother with the etcd or consul backends, as I run this on a single host, so I find the high-availability backends unnecessary. What I'm getting at is that I think a minimal example docker-compose file, with related config files and documentation, would be useful. Thoughts? I'm happy to put some time into this.

@minrk
Member Author

minrk commented Feb 24, 2023

Does anyone remember the background decisions that led to the choice of consul and etcd? This would help us decide whether to keep, deprecate or drop them.

IIRC (maybe @GeorgianaElena remembers better), etcd was selected as the first, just because it was the first and simplest kv store that came to mind. We picked up consul due to apparent performance issues with etcd (#56). Both being simple go binaries also makes them easy to install/deploy, e.g. for tests, but I don't think they were chosen with great care.

traefik config-loading seems to be incredibly slow compared to CHP, but we need to revisit the benchmarks to get an updated comparison (#163). Maybe we can get redis in there as well. I can't seem to find a benchmark of traefik's KV performance for different providers. The main consideration is traefik key-value watch performance, which does seem to vary across KV implementations, at least in traefik 1.x.

@minrk
Member Author

minrk commented Mar 13, 2023

Exploring consul clients a bit more, there's:

  • hc-pyconsul, which appears to be brand new and active, but only created/used by one person so far
  • py-consul, a slightly less outdated fork than python-consul2, but it is an explicitly temporary fork that doesn't allow Issues, so it isn't really planned as a stable client

@minrk
Member Author

minrk commented Mar 17, 2023

after #185, it should be a lot easier to add KV implementations like redis, since only 3 methods need to be implemented - generic methods to add, remove, and get keys from a kv store.
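
As a rough illustration of how small such a backend could be, here is a sketch of a three-method interface with an in-memory implementation. The class and method names (KVStore, put, delete, get_prefix) are hypothetical and are not the names used in #185. A redis backend would implement the same three operations against a redis client instead of a dict.

```python
from abc import ABC, abstractmethod


class KVStore(ABC):
    """Hypothetical minimal interface: add, remove, and get keys."""

    @abstractmethod
    def put(self, key: str, value: str) -> None:
        """Add or overwrite one key."""

    @abstractmethod
    def delete(self, key: str) -> None:
        """Remove one key; a missing key is not an error."""

    @abstractmethod
    def get_prefix(self, prefix: str) -> dict:
        """Return every key/value pair whose key starts with prefix."""


class InMemoryKVStore(KVStore):
    """Dict-backed stand-in, only useful for tests and illustration."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)

    def get_prefix(self, prefix: str) -> dict:
        # Sort the keys so the result is deterministic.
        return {k: self._data[k] for k in sorted(self._data) if k.startswith(prefix)}


store = InMemoryKVStore()
store.put("demo/routers/a/rule", "rule-a")
store.put("demo/routers/b/rule", "rule-b")
store.put("other/key", "ignored")
store.delete("demo/routers/b/rule")
routes = store.get_prefix("demo/")
```

Here routes ends up holding only the "demo/routers/a/rule" entry, since the second router was deleted and the third key sits outside the prefix.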

@minrk
Member Author

minrk commented Jan 19, 2024

btw, I found etcd3gw's development page, which I couldn't find last time since all of its official links are broken. It's still clearly actively maintained, but shows all signs of being a purely internal tool, not meant for public use:

The next time etcd breaks, if it happens, I think we should either:

  • drop etcd3 support, or
  • use etcd3gw and vendor Calico's auth-adding subclass

@aberey

aberey commented Apr 22, 2024

Great to see redis support getting merged! Is this going to be released anytime soon?


5 participants