Scalability issue with authentication ? #979
Comments
Please note that I increased the authentication client timeout to 20 s (the default is 5 s); with the default Feathers setup the tests break even before reaching 60 concurrent clients. |
This has already been discussed in feathersjs-ecosystem/authentication-local#70. From the BCrypt.js documentation:
If you require better performance you have to implement a different password hashing strategy. |
I knew that password hashing is a slow process, but here I simply perform authentication with an existing user/password. Does password comparison also suffer from this scalability issue, as far as you know? |
Yes. The plain text password has to go through the same hashing mechanism in order to compare it. |
OK, this is a really good explanation, thanks! However, don't you think these numbers are far too high? I have been gathering some performance benchmarks about bcrypt, and with a work factor of 12 hashing takes on the order of hundreds of milliseconds on a laptop, far from 20 s (my timeout setup), e.g. this one from 2012, this one from 2018, etc. Maybe you can explain a bit the logic behind https://github.com/feathersjs/feathers/blob/master/packages/authentication-local/lib/utils/hash.js, which seems to increase the work factor depending on the date, to anticipate increases in computing power I guess. Do you know which work factor is used by default in Feathers now? It seems this changed at some point, because we used the local authentication module v0.4.3 for a long time and its bcrypt setup was less complex, see https://github.com/feathersjs/authentication-local/blob/v0.4.3/src/utils/hash.js. This might explain why we didn't notice it previously. Maybe the problem is also related to the event loop being blocked by bcrypt processing so that requests stack up. I will try to perform authentication with an existing JWT token instead of a full login to test whether things go better, and I think they will ;-) |
Just tested https://www.dailycred.com/article/bcrypt-calculator and the result is almost instant with a cost factor of 12. |
feathersjs-ecosystem/authentication-local#30 is the pull request and discussion for that. |
So as far as I understand, a cost factor of 12 is still used today, which keeps my observations relevant IMHO. |
It looks like it can be random between 12 and 19. You can try removing that code to see if it makes a difference (or implement your own hashing system). Besides bcrypt hashing, I am not aware of anything computationally intensive happening anywhere else in Feathers, so any delays probably come from there. I never had an application where 30+ people logged in at the exact same time, so I'll let you make a call on this by putting up a PR. |
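To put the "between 12 and 19" range in perspective: bcrypt's cost factor is the base-2 logarithm of the number of key-expansion rounds, so each increment doubles the work. The sketch below only does that arithmetic; it does not call bcrypt. The 250 ms baseline is a hypothetical figure standing in for the "hundreds of milliseconds" mentioned above, not a measurement.

```javascript
// bcrypt runs 2^cost key-expansion rounds, so the work doubles with each increment.
function bcryptRounds(cost) {
  return 2 ** cost;
}

// How many times more work `cost` requires than `baseline`.
function relativeWork(cost, baseline) {
  return bcryptRounds(cost) / bcryptRounds(baseline);
}

// Hypothetical time for one hash at cost 12, in milliseconds (assumption, not measured).
const baselineMs = 250;

for (const cost of [12, 14, 16, 19]) {
  const factor = relativeWork(cost, 12);
  const seconds = (baselineMs * factor) / 1000;
  console.log(`cost ${cost}: ${factor}x the work of cost 12, roughly ${seconds} s per hash`);
}
```

Under that assumption, a cost of 19 is 128 times the work of a cost of 12, which would push a single hash well past a 20 s timeout. Whether the factor actually reaches such values in a given install has to be checked against the hash.js linked above.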
Maybe you did not read my article completely, but in my tests the number of logins is far lower than that. I have a ramp-up phase (e.g. one minute) during which clients progressively connect until we have, let's say, 60 concurrent clients connected. Then each client calls a service and pauses for, let's say, 5 s. After a given number of calls a client logs out/disconnects and is replaced by a new one (so we have a login here) in order to maintain the concurrency level for the whole test duration. With my numbers this means that in the worst case we are performing 1 login per second while some other clients might be accessing the service. But it is usually far less than that, because each client pauses between service calls to simulate a user reading the content, so most clients are connected but inactive. I will try to track the number of logins per second and get back to you with the results in case I am wrong. I also updated my code to authenticate directly with a JWT token strategy instead of performing a password-based login. Things are better: I am now getting some timeouts starting at 200 concurrent clients instead of 60, but it does not seem to be such a great improvement :-( By the way on the client I now get |
Good news: with JWT I increased the ramp-up duration so that there is no more than 1 login per second, and I successfully jumped to 1000 concurrent clients on my hardware without problems. It seems there is a barrier around 1/2 logins per second with JWT, and lower with local login. I am not sure if this is normal, but I have at least improved my knowledge today. I also updated my article to add the JWT-based authentication strategy. I would appreciate it if someone could provide some benchmark including authentication to see if it looks similar. |
I just ran a REST benchmark including authentication from my performance comparison repo. The stats without authentication were:
Same request with JWT authentication (using a dummy user service):
It's definitely slower but not unreasonably so. So I think the problem is either
|
Thanks for the feedback. I also updated my benchmark to work with REST and experienced no problems with it. I extended my article with an example of our staging infrastructure at the end. So far with sockets:
I wonder if there is a particular state causing sockets to be slower (e.g. an array of connections you have to look into at each new connection, etc.)? This might also be the cost of the underlying socket library; we use socket.io, by the way. |
Are you using the latest Feathers with channels? How is the performance of logins (local authentication) via REST? |
Yes, I use Feathers V3 with channels, but I deactivated them in the test apps to avoid any additional workload. I can share my numbers for a ramp-up duration of 60 s up to 60 concurrent clients, i.e. one local login per second on average. What I observe is really strange behavior with authentication. Although REST does not raise any timeout, it actually performs a lot worse! REST with local authentication:
Websockets with local authentication:
With the app without authentication, the numbers for REST and Websockets are pretty similar. REST without authentication:
Websockets without authentication:
When I said that the local authentication problem does not affect REST, maybe I was misled by the fact that the timeout option of the authentication module is not actually taken into account with REST? So I tried to manage it myself and bingo, changing the REST client configuration like this: REST with local authentication (corrected):
In any case I also share the numbers with JWT authentication. REST with JWT authentication:
Websockets with JWT authentication:
So in short:
|
What I don't understand is why raw performance is so much better in your tests. I am not saying my benchmark is free of bugs, but I am logging the number of connections and it seems to be consistent. There are two main differences between your benchmark and mine. First I am using Feathers client. |
It would be great to see if those specific issues still persist in the latest version (v4). |
Going to close this since even if it is still the case it is part of how bcrypt is designed. The new authentication system is flexible enough to customize the hashing algorithm by customizing the strategy. |
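For anyone landing here later, the following is a minimal sketch of what an alternative hashing pair could look like, using Node's built-in `crypto.scrypt` (available since Node 10.5) as a stand-in for bcrypt. The function names `hashPassword` and `comparePassword`, the `salt:hash` storage format, and the key length are illustrative assumptions; how such functions get wired into a custom strategy is not shown and should be checked against the authentication documentation. The default scrypt parameters are used here; choosing parameters for production is out of scope.

```javascript
const crypto = require('crypto');
const { promisify } = require('util');

// The callback-based crypto.scrypt runs on the libuv thread pool,
// so the event loop stays free while a hash is being computed.
const scrypt = promisify(crypto.scrypt);

const KEY_LENGTH = 64; // bytes; illustrative choice, not a Feathers default

// Hash a plain-text password and return "saltHex:hashHex" (hypothetical storage format).
async function hashPassword(password) {
  const salt = crypto.randomBytes(16);
  const derived = await scrypt(password, salt, KEY_LENGTH);
  return `${salt.toString('hex')}:${derived.toString('hex')}`;
}

// Comparing means re-deriving the key from the candidate password with the stored salt,
// so a comparison costs as much as a hash.
async function comparePassword(password, stored) {
  const [saltHex, hashHex] = stored.split(':');
  const expected = Buffer.from(hashHex, 'hex');
  const derived = await scrypt(password, Buffer.from(saltHex, 'hex'), expected.length);
  return crypto.timingSafeEqual(derived, expected);
}

// Usage example.
hashPassword('secret')
  .then((stored) => comparePassword('secret', stored))
  .then((matches) => console.log('password matches:', matches));
```

Any deliberately slow hash will still cap the number of logins per second per core; the thread pool only keeps other requests from stalling in the meantime.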
Steps to reproduce
Create two chat apps using the CLI by following https://docs.feathersjs.com/guides/chat/readme.html, with NeDB as the datastore and socket.io as the transport, one with local authentication enabled and one without it.
Create a test user in each one. To track the concurrent number of connections simply change the socketio configuration line like this:
We also disabled channels.
Use this benchmark article to perform a workload test of the applications. The following scenario should be run for the authenticated app:
The following scenario should be run for the unauthenticated app:
We ran these scenarios with a ramp-up duration of 60 seconds and the following configurations:
This means that in the worst case we are performing 1 client authentication per second while other clients access the service. But it is usually less than that, because after consulting the service each client pauses for 5 s to simulate a user reading the content, so most clients are connected but inactive.
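The login-rate reasoning above can be sketched as a small calculation. The ramp-up figures (60 seconds, 30 or 60 clients) come from this report; the 1 s of CPU time per login is an assumption for illustration only, since a measured authentication response time also includes time spent waiting.

```javascript
// Average login rate when clients connect evenly over the ramp-up window.
function rampUpLoginsPerSecond(concurrentClients, rampUpSeconds) {
  return concurrentClients / rampUpSeconds;
}

// Share of one CPU core consumed by hashing; a value of 1 or more means
// logins arrive faster than they can be processed, so a backlog builds up.
function hashingLoad(loginsPerSecond, hashSeconds) {
  return loginsPerSecond * hashSeconds;
}

// Assumption for illustration: each login keeps one core busy for about 1 s.
const assumedHashSeconds = 1;

for (const clients of [30, 60]) {
  const rate = rampUpLoginsPerSecond(clients, 60);
  const load = hashingLoad(rate, assumedHashSeconds);
  const note = load >= 1 ? ' (saturated)' : '';
  console.log(`${clients} clients: ${rate} login/s, hashing load ${load}${note}`);
}
```

If the real per-login CPU time is anywhere near that assumption, 1 login per second is already at the limit of a single core.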
Actual behavior
Everything goes fine with the unauthenticated app: the average service response is always under 10 ms.
Up to 30 concurrent clients on the authenticated app, almost everything goes fine as well: the average service response is around 150 ms and the average authentication response is around 1 s.
With 60 concurrent clients on the authenticated app, almost a quarter of them hit this error:
Error: Authentication timed out
The average service response is around 700 ms and the average authentication response is around 8 s (for those that manage to connect).
I am not 100% sure about this, but when I wrote the benchmark article to perform a workload test of our production application with Feathers V2, I think we supported something like 500 concurrent connections on the same hardware. So this might be a V3-specific issue. When running the test again on our production app, now migrated to V3, we also noticed behavior similar to issue #892, which may be related to timeouts.
What is strange is that increasing the ramp-up duration does not help; it seems there is a "processing barrier" at some point that prevents authentication from scaling.
Expected behavior
A basic Feathers app should support authenticating a large number of users per minute on a decent machine. Of course we could probably make this benchmark work with multiple app instances, but it seems to me that 60 concurrent clients is not too many for a "good" machine.
A good test has already been done on websockets, but it did not include authentication; it might be interesting to try to reproduce this issue with it.
System configuration
Hardware: Core i7 7700HQ 2.8 GHz (4 cores), 16GB RAM
Module versions :
NodeJS version: 8.9
Operating System: Windows