# Memory Leak on Actix 3.2.0 #1780
actix-web uses caches internally for certain objects to reduce memory allocation, so the startup memory usage normally won't match what you end up with after a stress test.
Well, some of the caches are not deallocated until the server is dropped, meaning there is no TTL. If memory consumption is not your first concern, then the current design is a good balance between performance and RAM usage.
Indeed it's not my first concern, and the current design does make sense. Maybe on nano computers low memory consumption would be required, but that's not my case, so thanks a lot for the explanation. I'll let you decide whether to close this ticket or not.
I'm currently experiencing something that looks like a memory leak too. It might be because of my own code; I'm going to clean up my code a bit and post it later on. Here is the valgrind summary:
Note: I always have at most one websocket open in these tests, I just open and close them. Edit: after rewriting a lot of my code, it seems like I don't leak memory anymore. Either that, or it was indeed just the actix internal cache all along.
I have some questions:
I'm the one who called it a "hello world" API, because that's what the code looks like, but maybe I'm missing something? I'm new to Actix.
I think a lot of people use Rust for APIs precisely because Rust is very resource efficient. Right now it seems like Actix is consuming a lot more RAM than even a .NET 5 Web API does, and the fact that it apparently stays at such high RAM usage forever is a concern to me. If I'm running a few APIs on a server with 500MB RAM, they will all keep increasing in RAM usage until the server runs out of memory. I haven't tested this yet, but does Actix top out at about 120MB, or does it consume a lot more if you have more routes?
I don't have numbers, so you'll have to test it yourself. For most entry-level servers, your machine physically can't handle the amount of concurrent requests needed to actually reach the memory consumption the OP saw.
True, but I don't think people generally use it to simulate real loads. They use it to simulate worst-case scenarios, such as DoS attacks or a "hug of death" from sites like Reddit. I understand that other bottlenecks in the network or server hardware might prevent such loads in some cases, though.
I guess I'll have to test it somehow. I don't know enough about Actix to tell whether the same problem applies if you only have, for example, 10 requests per second over a long period of time. It does bother me that such a small API was able to allocate 120MB permanently, until you manually shut down the server.
Alright, I've done some tests with a similar hello world app:

```rust
use actix_web::{get, web, App, HttpServer, Responder};

#[get("/{name}")]
async fn index(web::Path(name): web::Path<String>) -> impl Responder {
    format!("Hello {}", name)
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| App::new().service(index))
        .bind("127.0.0.1:8080")?
        .run()
        .await
}
```

I used Bombardier for testing on a PC with an Intel i5 3570k (4c/4t) CPU and 16GB DDR3 RAM: low concurrency over 3m 46s, high concurrency over 4m 6s. I didn't get the massive 120MB RAM usage that McFloy got, but at the same time the server did not return to its original RAM usage. I'll post results from a more complex API when I've actually built my application.
I've tested this issue a bit with actix. As awulkan noted, the final RAM consumption can vary significantly when stressing the server in different tests. This doesn't seem to follow a simple logic: running a small load test might lead to higher RAM usage afterwards than a large load test with more requests, or vice versa. Even running a second, identical test can lower the RAM consumption. However, the RAM usage never climbs more than about 5MB after my tests. For most workloads this is OK, but the caching could still be more efficient, given that it's sending the same response over and over. Maybe that makes sense though; I don't know the internal caching mechanism. Of course this isn't a simple "hello world" project, but even in an actual "hello world" that just serves a static response I could measure similar behaviour. The good news in all this is that this behaviour is not part of the detected memory leak. The bad news is that actix tends to leak a small amount of memory independently of what you do. In my case around 8000 bytes were leaked no matter whether I sent requests or not.
#1889 does not fix the memory usage issue, but it would reduce some noise when profiling the memory.
This PR tries to address the memory usage problem related to this issue. I'm not certain it's the cause or a proper fix for the problem, so please don't get your hopes up.
I can confirm that the latest betas behave a lot better than the latest stable with respect to memory consumption.
Thanks for the feedback. It matches what I observe in my projects. The reduced default buffer size lowers memory consumption by a lot on light requests. That said, for heavy requests/responses that are large in size, the app could still consume as much as actix-web v3 does.
Actually, what we're seeing is slightly different. We have an architecture where thousands of WS clients are persistently connected, plus a REST service allowing other systems to talk to those clients. When we run a benchmark over that REST interface to talk to the devices, in v3 we see skyrocketing memory consumption which doesn't really recover; in the beta it drops back to a (rather high) plateau after the load is reduced.
It would be appreciated if there were some example simulating your use case, to profile the memory usage better. I have some clues on how to further reduce the memory footprint, and more cases to study would help.
Maybe we can spin something up based on the WS chat example... @thalesfragoso, any chance you could provide such a testcase?
Even before running the benchmark I could see a constant increase in memory usage just by doing rounds of connecting and disconnecting several thousand WS clients. In the betas the memory usage stabilizes after a few rounds of connecting/disconnecting. I can try to reproduce the behavior in a smaller, self-contained testcase.
@fakeshadow I created a small stress test based on the websockets example. It just connects a bunch of clients and then disconnects them and repeats... Here on my setup I got the following results using 10000 clients per round:
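For reference, a minimal sketch of what such a stress client could look like, assuming a websocket-chat-style server at ws://127.0.0.1:8080/ws/ (the URL, client count, and round count here are assumptions, not the exact test that was run):

```rust
// Hypothetical stress client: connect a batch of WS clients, drop them, repeat.
// Assumes a server speaking plain websockets at ws://127.0.0.1:8080/ws/.
use awc::Client;

#[actix_rt::main]
async fn main() {
    let client = Client::new();
    for round in 0..10 {
        let mut conns = Vec::with_capacity(10_000);
        for _ in 0..10_000 {
            // connect() resolves to (HTTP upgrade response, framed WS stream)
            let (_resp, framed) = client
                .ws("ws://127.0.0.1:8080/ws/")
                .connect()
                .await
                .expect("ws connect failed");
            conns.push(framed);
        }
        // dropping the framed streams closes every connection in this round
        drop(conns);
        println!("round {} done", round);
    }
}
```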
Another thing I noticed is that the test is way faster on the betas; it actually takes a considerable amount of time to run 10 rounds on the current release.
@thalesfragoso Sorry for the late reply; this comment somehow slipped through my notifications.
@thalesfragoso After some tests I found that the awc client makes up nearly half of the memory usage. With the latest master branch, your server's memory usage lands at 250mb for me with 10 thousand connections and 20 rounds; awc uses 170mb. The issue is in both server and client, but at the same time the server does not take as much memory as I imagined. The lack of memory recycling is still there.
Well, in our server product we seem to be using around 20kB per established connection (without TLS), which is okay but not great if you want to scale to millions. I was certainly hoping for an order of magnitude less, especially since TLS adds a bunch on top.
Yeah, I'm looking into the memory footprint. It was just an observation on my part about that test code, and I was suggesting it could be separated into two binaries for a better view of the problem.
Yes, there are certainly multiple effects in play which should be examined and dealt with separately.
I separated the binaries as suggested.
```toml
[dependencies]
actix = { version = "0.11.0-beta.3" }
actix-codec = "0.4.0-beta.1"
actix-web = { git = "https://github.com/actix/actix-web.git", default-features = false, branch = "opt/no_pre_alloc" }
actix-web-actors = { git = "https://github.com/actix/actix-web.git", branch = "opt/no_pre_alloc" }
awc = { git = "https://github.com/actix/actix-web.git", branch = "opt/no_pre_alloc" }
anyhow = "1.0.38"
log = "0.4.14"
futures = "0.3.12"
structopt = "0.3"
pretty_env_logger = "0.4"

[patch.crates-io]
actix-web = { git = "https://github.com/actix/actix-web.git", branch = "opt/no_pre_alloc" }
actix-http = { git = "https://github.com/actix/actix-web.git", branch = "opt/no_pre_alloc" }
awc = { git = "https://github.com/actix/actix-web.git", branch = "opt/no_pre_alloc" }
actix-service = { git = "https://github.com/actix/actix-net.git" }
```

With this branch of actix-web I'm seeing an 80mb reduction in memory usage on the server side (10000 connections). The only thing it does is avoid pre-allocating an 8kb write buffer for every connection, and that memory does add up. It's also worth noting that this reduction does not translate directly to the real world: once you start reading/writing more on every connection, your memory usage will still go up. The best we can do is reduce the usage of Bytes where we can and avoid extra copies between them. In the end, the final memory footprint will still be close to how much you actually use.
@fakeshadow Thanks for your work. I tested the branch and I'm also seeing a considerable reduction in memory usage in our application, even after exchanging a few small messages (which is probably due to the smaller write buffer). I skimmed through the changed code a little and noticed that the read buffer still gets pushed to 8kB on reads if its remaining capacity is less than 1kB. I get that it can make performance better, but when we're talking about hundreds of thousands of clients exchanging small messages, that becomes a bad deal. It would be good if we could support this use case better. Just a small question: … Thanks again.
Yes, the read buffer still gets pre-allocated when the first read happens. This is mostly a perf choice; I believe there is a strong need for a smaller buffer (or a bigger one), and it's in the plan to make the buffer size changeable with a public API, but it may not make it into the first v4 release.
I don't know enough of … That said, from what I observe with the PRs in this issue, the buffer size indeed impacts the memory footprint, so the source could be highly related to it.
You are welcome.
In my reading of the code it only resets the high water mark but never drops any of the backing buffer, so it might grow but it will never shrink again. That shouldn't be too much of a problem for typical REST interfaces, but for long-lasting WS connections it doesn't seem ideal.
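For illustration, this grow-but-never-shrink behaviour is easy to reproduce with the bytes crate directly (a standalone sketch, not actix-web's dispatcher code):

```rust
use bytes::BytesMut;

fn main() {
    let mut buf = BytesMut::with_capacity(8 * 1024);

    // One large frame forces the buffer to grow well past its initial 8 KiB...
    buf.extend_from_slice(&vec![0u8; 256 * 1024]);
    assert!(buf.capacity() >= 256 * 1024);

    // ...but clearing it only resets the length; the backing allocation
    // is kept, so a long-lived connection stays at its peak size.
    buf.clear();
    assert_eq!(buf.len(), 0);
    assert!(buf.capacity() >= 256 * 1024);
}
```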
It's possible to adopt some type of vector buffer to replace the …
Right now there is an extra copy between the response payload and the tcp stream, and it's done through a …
I'm not sure I follow. Extra copies are overhead, sure, but how can they cause bloat? Buffers which only grow but never shrink for persistent connections are a completely different quality of problem: if you have a million devices persistently connected which mostly send small keepalive messages, but every now and then send a large amount of data, and the buffer stays at that size, you'll soon have a proper DoS on your hands. I have zero worries about extra copies to a temporary buffer which is dropped after processing the data.
Memory allocated by … The problem is not that actix-web doesn't drop or keeps reusing the buffer. The problem is that drop does nothing.
I don't think that's correct. If you drop the structure, the backing store is dropped as well and you should get the memory back. Only draining/truncating/clearing/splitting won't do anything but change the …
actix-web-actors sits on top of h1, so it shares the same problem. The problem is that drop … To be clear, I'm not saying the problem is in the bytes crate; it's that in actix-web's case the h1 dispatcher's drop does not reclaim memory reserved for or allocated to the BytesMut it owns.
It should free the memory if it isn't in the shared state, or, if in the shared state, when it's the last ref alive to the underlying buffer. Maybe we don't see the reduction at the process level due to how the allocator works? @fakeshadow I've been looking at the … I think a good starting point would be to try to reduce that 8kB allocation for the …
My main concern is that drop does not deallocate memory. As in my previous post, this issue is more about websockets and live connections, and it affects everyone using h1. Reducing the peak memory footprint is important too, but it should come after.
8kb is already a big cut from the 32kb watermark used in v3, and we should keep in mind that people use h1 differently and can have big request bodies. Therefore I don't believe another straight-up cut to the reserve size is a good option. I would prefer a public API for changing the size and/or a dynamic buffer reserve strategy.
I think it doesn't make sense to allocate a largish buffer if it's not certain you'll need it. If you have sizeable request bodies you'll typically need to resize anyway (whether the buffer starts at 1kB, 8kB or 32kB), but in every other case you're always losing... I wouldn't mind an API which allows controlling the buffer, or pre-sizing it when you know you'll need more (to avoid multiple resizes), but reducing the initial size seems like an obvious and trouble-free win to me.
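A rough sketch of that idea, allocating nothing up front and reserving only a small amount right before a read (the 1kB low watermark here is illustrative, not actix-web's actual strategy):

```rust
use bytes::BytesMut;

// Illustrative low watermark; not a value taken from actix-web.
const LOW_WATERMARK: usize = 1024;

/// Ensure there is room for the next read without pre-paying 8kB+
/// per idle connection. BytesMut grows geometrically, so large
/// bodies still pay only a handful of reallocations.
fn ensure_read_capacity(buf: &mut BytesMut) {
    if buf.capacity() - buf.len() < LOW_WATERMARK {
        buf.reserve(LOW_WATERMARK);
    }
}

fn main() {
    let mut read_buf = BytesMut::new(); // zero capacity: no allocation yet
    ensure_read_capacity(&mut read_buf);
    assert!(read_buf.capacity() >= LOW_WATERMARK);
}
```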
I'm creating a simple API and put it under stress using loadtest. The API consumes 3MB when idling and not under stress. With the command:

```
loadtest -c 1000 --rps 10000 http://localhost:4000/hello/world
```

the memory consumption goes to ~100MB.

## Expected Behavior

After the stress test is done, the memory consumption goes back to a normal level (~3MB).

## Current Behavior

After the stress test is done, the memory consumption doesn't decrease.

## Possible Solution

Sorry, I can't find any solution for this.
## Steps to Reproduce (for bugs)

## Context

I tried to benchmark an API made in Rust against one made in Spring, and while the Rust version is memory efficient, it does show some leaks that are problematic.
Code used for exposing the route.
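A minimal route matching the loadtest URL above might look like this (a sketch only; the handler body and bind address are assumptions):

```rust
// Hypothetical reconstruction of a hello-world route on port 4000,
// matching the /hello/world URL used in the loadtest command.
use actix_web::{get, web, App, HttpServer, Responder};

#[get("/hello/{name}")]
async fn hello(web::Path(name): web::Path<String>) -> impl Responder {
    format!("Hello {}", name)
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| App::new().service(hello))
        .bind("127.0.0.1:4000")?
        .run()
        .await
}
```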
Using LeakSanitizer it does confirm that there's a little problem:
## Your Environment

- Rust version (output of `rustc -V`): rustc 1.45.1 (c367798cf 2020-07-26)
- actix-web version: 3.2.0