Memory Leak on Actix 3.2.0 #1780

Closed
Mcfloy opened this issue Nov 8, 2020 · 41 comments
Labels
A-web project: actix-web C-perf-mem Category: performance / memory / resources

Comments

@Mcfloy

Mcfloy commented Nov 8, 2020

I'm creating a simple API and putting it under stress using loadtest. The API consumes 3MB when idling and not under stress.
With the command loadtest -c 1000 --rps 10000 http://localhost:4000/hello/world, the memory consumption goes to ~100MB.

Expected Behavior

After the stress test is done, the memory consumption goes back to a normal level (~3MB).

Current Behavior

After the stress test is done, the memory consumption doesn't decrease.

Possible Solution

Sorry, I can't find a solution for this.

Steps to Reproduce (for bugs)

  1. Implement a simple API and call the exposed route.

Context

I tried to benchmark an API written in Rust against one written in Spring, and while the Rust version is memory efficient, it does show some leaks that are problematic.


Code used for exposing the route.

use actix_web::{middleware, web, App, HttpServer, Responder, HttpResponse};

async fn hello_world(
    name: web::Path<String>
) -> impl Responder {
    HttpResponse::Ok().json(format!("{}", name))
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {

    println!("Launching server...");

    std::env::set_var("RUST_LOG", "actix_web=info,actix_server=info");
    env_logger::init();
    dotenv::dotenv().ok();

    let port = match std::env::var("PORT") {
        Ok(port) => port,
        _ => String::from("4000")
    };
    let address = format!("0.0.0.0:{}", port);

    let http_server = HttpServer::new(move || {
        App::new()
            // .wrap(middleware::Logger::default())
            .route("/hello/{name}", web::get().to(hello_world))
    })
        .bind(address.clone())?;

    println!("Will listen to {}", address);

    http_server
        .run()
        .await
}

Using LeakSanitizer does confirm that there's a small problem:

==68512==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 256 byte(s) in 8 object(s) allocated from:
    #0 0x564cae1e4255  (/home/lperreau/workspace/hello-world/target/x86_64-unknown-linux-gnu/debug/hello_world+0xd6255)
    #1 0x564cae2f830b  (/home/lperreau/workspace/hello-world/target/x86_64-unknown-linux-gnu/debug/hello_world+0x1ea30b)

Direct leak of 192 byte(s) in 8 object(s) allocated from:
    #0 0x564cae1e4255  (/home/lperreau/workspace/hello-world/target/x86_64-unknown-linux-gnu/debug/hello_world+0xd6255)
    #1 0x564cae40bf8b  (/home/lperreau/workspace/hello-world/target/x86_64-unknown-linux-gnu/debug/hello_world+0x2fdf8b)

Indirect leak of 8192 byte(s) in 8 object(s) allocated from:
    #0 0x564cae1e4255  (/home/lperreau/workspace/hello-world/target/x86_64-unknown-linux-gnu/debug/hello_world+0xd6255)
    #1 0x564cae2f830b  (/home/lperreau/workspace/hello-world/target/x86_64-unknown-linux-gnu/debug/hello_world+0x1ea30b)

Indirect leak of 32 byte(s) in 8 object(s) allocated from:
    #0 0x564cae1e4255  (/home/lperreau/workspace/hello-world/target/x86_64-unknown-linux-gnu/debug/hello_world+0xd6255)
    #1 0x564cae565a5b  (/home/lperreau/workspace/hello-world/target/x86_64-unknown-linux-gnu/debug/hello_world+0x457a5b)

SUMMARY: LeakSanitizer: 8672 byte(s) leaked in 32 allocation(s).

Your Environment

  • Rust Version (i.e., output of rustc -V): rustc 1.45.1 (c367798cf 2020-07-26)
  • Actix Web Version: 3.2.0
@fakeshadow
Contributor

actix-web uses caches internally for certain objects to reduce memory allocation, so the startup memory usage normally won't match what you end up with after a stress test.
Can you run multiple tests to see whether the memory usage keeps going up or stabilizes within a certain range?
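
For illustration, here is a minimal sketch of the kind of per-worker object pool being described (this is not the actual actix-web internals; the Pool type and its methods are made up for the example):

struct Pool<T> {
    free: Vec<T>,
    cap: usize,
}

impl<T: Default> Pool<T> {
    fn new(cap: usize) -> Self {
        Self { free: Vec::new(), cap }
    }

    // Take a recycled object if one is available, otherwise allocate a new one.
    fn take(&mut self) -> T {
        self.free.pop().unwrap_or_default()
    }

    // Return an object to the pool. Once the pool is full the object is simply
    // dropped, which is why idle memory settles above the startup figure but
    // does not grow without bound.
    fn put(&mut self, value: T) {
        if self.free.len() < self.cap {
            self.free.push(value);
        }
    }
}

fn main() {
    let mut pool: Pool<String> = Pool::new(128);
    let mut s = pool.take();
    s.push_str("hello");
    s.clear();   // reset before recycling
    pool.put(s); // reused by the next "request" instead of reallocating
}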

@Mcfloy
Author

Mcfloy commented Nov 9, 2020

Yep, I'll provide several screenshots. I'll use docker stats to show the memory usage at the beginning, after 5 minutes of stress, and after 10 minutes of rest, repeating the cycle a few times.

The API at the beginning:
[screenshot: docker stats output]

Then I'll do the following command: loadtest -c 1000 --rps 10000 http://localhost:4000/hello/world

After 5 minutes, this is how the API is handling it (still under the stress test):
[screenshot: docker stats output]
The memory usage has stabilized: it went from 3MB to 110MB very quickly but held there and kept handling the load, so the cache explanation does make sense.

Let's take a rest and wait 10 minutes (the API is not under stress):
[screenshot: docker stats output]

We stress it again for another 5 minutes:
[screenshot: docker stats output]

And then let it rest for 10 minutes:
[screenshot: docker stats output]

Now I see that I was wrong to call it a "memory leak", since the API's memory usage stabilizes while under stress, but I still find it odd that the memory usage doesn't go back to a normal level after some time (cache invalidation, maybe?).

@robjtede robjtede added needs-investigation A-web project: actix-web labels Nov 9, 2020
@fakeshadow
Contributor

Well, some of the caches are not deallocated until the server is dropped, meaning there is no TTL.
The reason is that the cache can be used by every incoming request (the HttpRequest object, for example). There is no point in releasing it and then rebuilding it again for the next incoming request.

If memory consumption is not your first concern, then the current design is a good balance between performance and RAM usage.
That said, I imagine an option to disable some of the cache usage could be useful if there is a strong need for low memory consumption.

@Mcfloy
Author

Mcfloy commented Nov 9, 2020

Indeed, it's not my first concern, and the current design does make sense. Low memory consumption might matter on nano computers, but that's not my case, so thanks a lot for the explanation.

I'll let you decide whether to close this ticket or not.

@BartWillems

BartWillems commented Nov 10, 2020

memory leak

I'm currently experiencing something that looks like a memory leak too.
I see constant increases when new websockets are connected.

It might be because of my own code. I'm going to clean up my code a bit and post it later on.

Here is the valgrind summary:

==18921== 
==18921== HEAP SUMMARY:
==18921==     in use at exit: 9,570,322 bytes in 58,310 blocks
==18921==   total heap usage: 227,608 allocs, 169,298 frees, 105,536,882 bytes allocated
==18921== 
==18921== LEAK SUMMARY:
==18921==    definitely lost: 13,128 bytes in 543 blocks
==18921==    indirectly lost: 0 bytes in 0 blocks
==18921==      possibly lost: 1,014,935 bytes in 12,808 blocks
==18921==    still reachable: 8,542,259 bytes in 44,959 blocks
==18921==         suppressed: 0 bytes in 0 blocks
==18921== Rerun with --leak-check=full to see details of leaked memory
==18921== 
==18921== For lists of detected and suppressed errors, rerun with: -s
==18921== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

The "definitely lost" amount is always the same, but the "possibly lost" amount increases when I open more websockets.

Note: I only ever have a maximum of 1 websocket open in these tests; I just open and close them.

Edit: after rewriting a lot of my code, it seems like I don't leak memory anymore.

Either that, or it was indeed just the actix internal cache all along.

no memory leak

@awulkan

awulkan commented Nov 27, 2020

@fakeshadow

Well, some of the caches are not deallocated until the server is dropped, meaning there is no TTL.
The reason is that the cache can be used by every incoming request (the HttpRequest object, for example). There is no point in releasing it and then rebuilding it again for the next incoming request.

I have some questions:

  1. What exactly is it caching? 120MB seems quite excessive for a "hello world" API.
  2. Does it deallocate if the server is running low on RAM? Otherwise it might be starving other applications, or crashing?

@fakeshadow
Contributor

fakeshadow commented Nov 27, 2020

  1. Mainly HttpRequest objects and the JoinHandles of async tasks. The OP calling his API a "hello world" doesn't mean it really is one; a typical hello world example would not consume that much memory.
    It's also worth noticing that people doing these kinds of tests tend to hit their apps with unrealistic load generators, which results in edge cases of memory usage that are not regularly seen in the real world.

  2. No. I'm not aware of any system memory check in actix-web. It assumes you are running on a machine with a considerable amount of memory, as actix-web targets std Rust. It can't run on the low-power/low-memory machines that typically require no_std.

@awulkan

awulkan commented Nov 27, 2020

@fakeshadow

  1. Mainly HttpRequest objects and the JoinHandles of async tasks. The OP calling his API a "hello world" doesn't mean it really is one; a typical hello world example would not consume that much memory.
    It's also worth noticing that people doing these kinds of tests tend to hit their apps with unrealistic load generators, which results in edge cases of memory usage that are not regularly seen in the real world.

I'm the one who called it a "hello world" API, because that's what the code looks like, but maybe I'm missing something? I'm new to Actix.

  1. No. I'm not aware of any system memory check in actix-web. It assumes you are running on a machine with a considerable amount of memory, as actix-web targets std Rust. It can't run on the low-power/low-memory machines that typically require no_std.

I think a lot of people use Rust for APIs because Rust is very resource efficient. Right now it seems like Actix consumes a lot more RAM than even a .NET 5 Web API does, and the fact that it apparently stays at such high RAM usage forever is a concern to me. If I'm running a few APIs on a server with 500MB of RAM, they will all keep increasing in RAM usage until the server runs out of memory.

I haven't tested this yet, but does Actix top out at about 120MB, or does it consume a lot more if you have more routes?

@fakeshadow
Contributor

I don't have numbers, so you'd have to test it yourself.
The thing is, a local workload generator is never a good indicator of what your real workload would be.

For most entry-level servers, your machine physically can't handle the amount of concurrent requests needed to actually reach the memory consumption the OP sees.

@awulkan

awulkan commented Nov 27, 2020

The thing is, a local workload generator is never a good indicator of what your real workload would be.

True, but I don't think people generally use it to simulate real loads. They use it to simulate worst-case scenarios, such as DoS attacks or a "hug of death" from sites like Reddit. I understand that there might be other bottlenecks in the network or server hardware that prevent such loads from happening in some cases, though.

For most entry-level servers, your machine physically can't handle the amount of concurrent requests needed to actually reach the memory consumption the OP sees.

I guess I'll have to test it somehow. I don't know enough about Actix to understand whether the same problem applies if you only have, for example, 10 requests per second over a long period of time.

It does bother me, though, that such a small API was able to allocate 120MB permanently, until you manually shut down the server.

@awulkan

awulkan commented Nov 27, 2020

Alright, I've done some tests with a similar hello world app.

use actix_web::{get, web, App, HttpServer, Responder};

#[get("/{name}")]
async fn index(web::Path(name): web::Path<String>) -> impl Responder {
    format!("Hello {}", name)
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| App::new()
        .service(index))
        .bind("127.0.0.1:8080")?
        .run()
        .await
}

I used Bombardier for testing on a PC with an Intel i5 3570k (4c/4t) CPU and 16GB DDR3 RAM.

Low concurrency, over 3m 46s.
.\bombardier-windows-amd64.exe -k -c 50 -n 10000000 http://localhost:8080/asd
Start RAM: 3.1MB
End RAM: 3.8MB
Peak: 5.4MB

High concurrency over 4m 6s.
.\bombardier-windows-amd64.exe -k -c 1000 -n 10000000 http://localhost:8080/asd
Start RAM: 3.1MB
End RAM: 8.1MB
Peak: 46MB

I didn't get the massive 120MB RAM usage that Mcfloy got, but at the same time the server did not return to the original RAM usage. I'll post results from a more complex API when I've actually built my application.

@AaronErhardt

I've tested this issue a bit with actix. As awulkan noted, the end RAM consumption can vary significantly when stressing the server in different tests. This doesn't seem to follow a simple logic: running a small load test might lead to higher RAM usage after the test than a large load test with more requests, or vice versa. Even running a second, identical test can lower the RAM consumption. However, the RAM usage never climbs more than about 5 MB after my tests. For most workloads this is okay, but the caching could still be more efficient, given that it's sending the same response over and over. Then again, maybe that makes sense; I don't know the internal caching mechanism. Of course this isn't a simple "hello world" project, but even in an actual "hello world" that just serves a static response I could measure similar behaviour.

The good news in all this is that this behaviour is not part of the detected memory leak. The bad news is that actix tends to leak a small amount of memory regardless of what you do. In my case, around 8000 bytes were leaked whether or not I sent any requests.

@fakeshadow
Contributor

#1889
With this PR, all Box::leak calls would be removed from actix-web. Once it's merged there will not be any intentionally leaked memory.

It does not fix the memory usage issue, but it should reduce some noise when profiling memory.
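
For context, here is a minimal illustration (not actix-web code) of why intentional Box::leak allocations show up in LeakSanitizer output even though they are deliberate: the allocation is never freed by design, so the sanitizer reports it at process exit.

fn main() {
    // Deliberately leak a heap allocation to obtain a &'static str.
    let config: &'static str = Box::leak(String::from("leaked on purpose").into_boxed_str());
    println!("{}", config);
    // LeakSanitizer reports the boxed string above as leaked when the process exits.
}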

@fakeshadow
Contributor

fakeshadow commented Jan 25, 2021

#1929

This PR tries to address the memory usage problem related to this issue.
It would be appreciated if you could help out by testing it and reporting whether it reduces your memory footprint under heavy workload.

I'm not certain it's the cause or a proper fix for the problem, so please don't get your hopes up.
You would also still encounter a small amount of memory bloat, as the cache is still there working. I consider that working as intended; it could possibly be addressed in the future.

@therealprof
Contributor

It would be appreciated if you could help out by testing it and reporting whether it reduces your memory footprint under heavy workload.

I can confirm that the latest betas behave a lot better than the latest stable with regard to memory consumption.

@fakeshadow
Contributor

fakeshadow commented Feb 22, 2021

I can confirm that the latest betas behave a lot better than the latest stable with regard to memory consumption.

Thanks for the feedback. It matches what I observe in my projects. The reduced default buffer size lowers memory consumption by a lot on light requests.

That said, for heavy requests/responses that are large in size, the app could still consume as much memory as actix-web v3 does.

@therealprof
Contributor

Actually, what we're seeing is slightly different. We have an architecture where thousands of WS clients are persistently connected, plus a REST service allowing other systems to talk to those clients. When we run a benchmark over that REST interface to talk to the devices, in v3 we see skyrocketing memory consumption which doesn't really recover; in the beta it drops back to a (rather high) plateau after the load is reduced.

@fakeshadow
Contributor

Actually, what we're seeing is slightly different. We have an architecture where thousands of WS clients are persistently connected, plus a REST service allowing other systems to talk to those clients. When we run a benchmark over that REST interface to talk to the devices, in v3 we see skyrocketing memory consumption which doesn't really recover; in the beta it drops back to a (rather high) plateau after the load is reduced.

It would be appreciated if there could be an example that simulates your use case, to better profile the memory usage. I have some ideas on how to further reduce the memory footprint, and more cases to study would help.

@therealprof
Contributor

Maybe we can spin something up based on the WS chat example... @thalesfragoso, any chance you could provide such a testcase?

@thalesfragoso
Member

When we run a benchmark over that REST interface to talk to the devices, in v3 we see skyrocketing memory consumption which doesn't really recover; in the beta it drops back to a (rather high) plateau after the load is reduced.

Even before running the benchmark, I could see a constant increase in memory usage just by doing rounds of connecting and disconnecting several thousand WS clients. In the betas, the memory usage stabilizes after a few rounds of connecting/disconnecting.

I can try to reproduce the behavior in a smaller, self-contained testcase.

@thalesfragoso
Member

@fakeshadow I created a small stress test based on the websockets example. It just connects a bunch of clients and then disconnects them and repeats...

Here on my setup I got the following results using 10000 clients per round:

  • Current release: Memory seems to keep increasing indefinitely with each round; after ten rounds it sits at around 807MB.
  • Betas: Memory usage seems to stabilize at around 434MB by round ~7, staying around that value for the rest of the test (20 rounds total).

Another thing I noticed is that the test is much faster on the betas; it actually takes a considerable amount of time to run 10 rounds on the current release.

@fakeshadow
Contributor

@thalesfragoso Sorry for the late reply. This comment somehow slipped through my notifications.
I'll look into the test code and try to figure out how we can bring down the memory footprint.

@fakeshadow
Contributor

@thalesfragoso After some testing I find that the awc client makes up nearly half of the memory usage. With the latest master branch, your server's memory usage lands at 250MB with 10 thousand connections and 20 rounds for me; awc uses 170MB.

The issue is in both the server and the client, but at the same time the server does not take as much memory as I imagined. The lack of memory recycling is still there.

@therealprof
Contributor

The issue is in both the server and the client, but at the same time the server does not take as much memory as I imagined. The lack of memory recycling is still there.

Well, in our server product we seem to be using around 20kB per established connection (without TLS), which is okay but not great if you want to scale to millions. I was certainly hoping for an order of magnitude less, especially since TLS adds a bunch on top.

@fakeshadow
Contributor

The issue is in both the server and the client, but at the same time the server does not take as much memory as I imagined. The lack of memory recycling is still there.

Well, in our server product we seem to be using around 20kB per established connection (without TLS), which is okay but not great if you want to scale to millions. I was certainly hoping for an order of magnitude less, especially since TLS adds a bunch on top.

Yeah, I'm looking into the memory footprint. It's just an observation on my part about that test code; I was suggesting it could be separated into two binaries for a better view of the problem.

@therealprof
Contributor

Yeah, I'm looking into the memory footprint. It's just an observation on my part about that test code; I was suggesting it could be separated into two binaries for a better view of the problem.

Yes, there are certainly multiple effects in play which should be examined and dealt with separately.

@thalesfragoso
Member

I separated the binaries as suggested.

@fakeshadow
Contributor

fakeshadow commented Mar 3, 2021

@thalesfragoso @therealprof

[dependencies]
actix = { version = "0.11.0-beta.3" }
actix-codec = "0.4.0-beta.1"
actix-web = { git = "https://github.com/actix/actix-web.git", default-features = false, branch = "opt/no_pre_alloc" }
actix-web-actors = { git = "https://github.com/actix/actix-web.git", branch = "opt/no_pre_alloc" }
awc = { git = "https://github.com/actix/actix-web.git", branch = "opt/no_pre_alloc" }

anyhow = "1.0.38"
log = "0.4.14"
futures = "0.3.12"
structopt = "0.3"
pretty_env_logger = "0.4"

[patch.crates-io]
actix-web= { git = "https://github.com/actix/actix-web.git", branch = "opt/no_pre_alloc" }
actix-http = { git = "https://github.com/actix/actix-web.git", branch = "opt/no_pre_alloc" }
awc = { git = "https://github.com/actix/actix-web.git", branch = "opt/no_pre_alloc" }
actix-service = { git = "https://github.com/actix/actix-net.git" }

With this branch of actix-web I'm seeing an 80MB reduction in memory usage on the server side (10,000 connections). The only thing it does is stop pre-allocating an 8kB write buffer for every connection, and that memory adds up: 8kB × 10,000 connections is roughly 80MB.

From this we can see that once we allocate space with Bytes, it tries to reuse the memory rather than deallocate it directly, which does make sense if you think about it. Hence the lack of recycling.

It's also worth noting that this reduction does not necessarily translate to the real world. Once you start reading/writing more on every connection, your memory usage still goes up. The best we can do is reduce the usage of Bytes where we can and avoid extra copies between buffers. But in the end, the final memory footprint will still be close to how much you actually use.
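
To make the change concrete, here is a minimal sketch of the idea as I understand it (an assumed shape, not the actual diff in the opt/no_pre_alloc branch; the function names are made up):

use bytes::BytesMut;

const HIGH_WATERMARK: usize = 8 * 1024;

// Before: every connection pays ~8kB of write buffer up front.
fn write_buf_prealloc() -> BytesMut {
    BytesMut::with_capacity(HIGH_WATERMARK)
}

// After: connections that never write much never allocate much.
fn write_buf_lazy() -> BytesMut {
    BytesMut::new()
}

fn main() {
    println!("prealloc capacity: {}", write_buf_prealloc().capacity());
    println!("lazy capacity:     {}", write_buf_lazy().capacity());
}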

@thalesfragoso
Member

@fakeshadow Thanks for your work. I tested the branch and I'm also seeing a considerable reduction in memory usage in our application, even after exchanging a few small messages (which is probably due to the smaller write buffer).

I skimmed through the changed code a little and noticed that the read buffer still gets pushed to 8kB on reads if its remaining capacity is less than 1kB. I get that it can improve performance, but when we're talking about hundreds of thousands of clients exchanging small messages, that becomes a bad deal. It would be good if we could support this use case better.

Just a small question: BytesMut doesn't ever seem to give up memory, right? So we would always be holding the peak memory usage (on the write and read buffers) even long after hitting it, is that correct?

Thanks again.

@fakeshadow
Contributor

fakeshadow commented Mar 5, 2021

I skimmed through the changed code a little and noticed that the read buffer still gets pushed to 8kB on reads if its remaining capacity is less than 1kB. I get that it can improve performance, but when we're talking about hundreds of thousands of clients exchanging small messages, that becomes a bad deal. It would be good if we could support this use case better.

Yes. The read buffer still gets pre-allocated when the first read happens. This is mostly a perf choice, because the Bytes crate suggests that pre-allocating, instead of allocating frequently on the fly, reduces overhead in normal cases.

I believe there is a strong need for smaller (or bigger) buffers, and it's in the plan to make the buffer size configurable with a public API, but it may not make it into the first v4 release.
It's also possible to use a dynamically scaled buffer size like hyper does, but that's still just an idea.

Just a small question: BytesMut doesn't ever seem to give up memory, right? So we would always be holding the peak memory usage (on the write and read buffers) even long after hitting it, is that correct?

I don't know enough about the Bytes crate, so I'm not sure if this is working as intended or if something actix-web does prevents it from freeing up the memory. I may need to investigate this more deeply to be sure.

That said, from what I observe with the PRs in this issue, the buffer size clearly impacts the memory footprint, so the source is very likely related to it.

Thanks again.

You are welcome.

@therealprof
Contributor

I don't know enough about the Bytes crate, so I'm not sure if this is working as intended or if something actix-web does prevents it from freeing up the memory. I may need to investigate this more deeply to be sure.

In my reading of the code it only resets the high-water mark but never drops any of the backing buffer, so it might grow but it will never shrink again. That shouldn't be too much of a problem for usual REST interfaces, but for long-lasting WS connections it doesn't seem ideal.

@fakeshadow
Contributor

It's possible to adopt some type of vectored buffer to replace BytesMut as the write buffer, reducing one source of copying and memory usage, like other crates do. This would be a major breaking change, as the current Encoder/Decoder traits actix-web uses come from tokio-util, which asks for a BytesMut input.

@therealprof
Contributor

BytesMut seems to use a Vec as its buffer. I'm not actually worried about copies at all; I'm much more worried about DoS by exploiting the unbounded memory use. TBH, I'd probably rather drop the BytesMut and create a new one (or implement actual truncation in bytes) after processing a complete WS message.
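
Roughly what I have in mind, as a sketch (illustrative only; the function name and threshold are made up, and this assumes the buffer is not in bytes' shared state, so that dropping it actually releases the allocation):

use bytes::BytesMut;

// After a complete message has been processed, swap the grown buffer out for
// a fresh one so the old backing allocation is dropped and returned.
fn shrink_after_message(buf: &mut BytesMut, threshold: usize) {
    if buf.is_empty() && buf.capacity() > threshold {
        *buf = BytesMut::new();
    }
}

fn main() {
    // Pretend a large frame grew the buffer to 64kB.
    let mut buf = BytesMut::with_capacity(64 * 1024);
    shrink_after_message(&mut buf, 8 * 1024);
    assert!(buf.capacity() <= 8 * 1024);
}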

@fakeshadow
Contributor

Right now there is an extra copy between the response payload and the TCP stream, and it's done through a BytesMut.
The extra memory bloat mostly comes from this.

@therealprof
Contributor

I'm not sure I follow. Extra copies are overhead, sure, but how can they cause bloat? Buffers which only grow but never shrink for persistent connections are a completely different class of problem: if you have a million persistently connected devices which mostly send small keepalive messages, but every now and then send a large amount of data and the buffer stays at that size, you'll soon have a proper DoS on your hands.

I have zero worries about extra copies to a temporary buffer which is dropped after processing the data.

@fakeshadow
Contributor

fakeshadow commented Mar 6, 2021

Memory allocated by BytesMut is not recovered even when the buffers are dropped after the keep-alive expires.

The problem is not that actix-web fails to drop the buffer or keeps reusing it. The problem is that dropping it does nothing.
This impacts every h1 connection. When I say reduce the use of BytesMut, I'm talking about a general solution rather than something specific to websockets.

@therealprof
Contributor

Memory allocated by BytesMut is not recovered even when the buffers are dropped after the keep-alive expires.

I don't think that's correct. If you drop the structure, the backing store is dropped as well and you should get the memory back. Draining/truncating/clearing/splitting, on the other hand, does nothing but change the len variable. Since we have a mut BytesMut, we'll look into whether it's possible to replace it with a fresh BytesMut after we've processed the data, but it would be great to have a general solution.

This impacts every h1 connection. When I say reduce the use of BytesMut, I'm talking about a general solution rather than something specific to websockets.

h1 is used in a completely different way than ws, so it might require a different solution.

@fakeshadow
Contributor

fakeshadow commented Mar 6, 2021

actix-web-actors is built on top of h1, so it shares the same problem.
It's also why there is an extra memory footprint when using it, as the actors also hold extra BytesMut buffers for decoding/encoding the h1 payload.
If you use the upgrade handler of HttpService you will see a much smaller memory footprint with ws.

The problem is that dropping the BytesMut does not free the memory, and that's why this issue exists for the most part.

To be clear, I'm not saying the problem is the bytes crate. It's that in actix-web's case, dropping the h1 dispatcher does not reclaim the memory reserved for or allocated to the BytesMut it owns.

@thalesfragoso
Member

The problem is that dropping the BytesMut does not free the memory, and that's why this issue exists for the most part.

It should free the memory if it isn't in the shared state, or, when in the shared state, if it's the last ref alive to the underlying buffer. Maybe we don't see the reduction at the process level due to how the allocator works?

@fakeshadow I've been looking at bytes's code to try to understand it better. It seems to me that BytesMut will avoid releasing memory if it isn't in the shared state (it will always hold the peak memory until the drop), but when in the shared state it will give up the old memory and allocate a new Vec when it doesn't have enough capacity while reserving more memory. It goes into the shared state on some occasions, like when using split, split_to, etc. The read_buffer in the Dispatcher seems to be subjected to a lot of these operations, so that might be solved already (?). The write_buffer, on the other hand, seems to only use advance, so it would always stay at peak consumption.

I think a good starting point would be to try to reduce that 8kB allocation for the read_buffer? That would improve the case where we have a large number of live connections. After that, we could try to look for good points to drop/split the write_buffer.
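
To check my reading, I put together a tiny probe of how BytesMut treats its backing storage (a sketch only; the printed capacities are an observation aid, not a documented contract of the bytes crate):

use bytes::{Buf, BufMut, BytesMut};

fn main() {
    // Write-buffer-style usage: bytes are appended and then consumed with
    // advance, which is the pattern the write buffer follows.
    let mut write_buf = BytesMut::with_capacity(8 * 1024);
    write_buf.put_slice(&[0u8; 4096]);
    write_buf.advance(4096);
    println!("write buf after advance: len={} cap={}", write_buf.len(), write_buf.capacity());

    // Read-buffer-style usage: frames are carved off with split_to, which puts
    // the buffer into the shared state; once the split-off handle is dropped,
    // a later reserve may allocate a fresh, right-sized vector.
    let mut read_buf = BytesMut::with_capacity(8 * 1024);
    read_buf.put_slice(&[0u8; 4096]);
    let frame = read_buf.split_to(4096);
    drop(frame);
    read_buf.reserve(8 * 1024);
    println!("read buf after split/reserve: len={} cap={}", read_buf.len(), read_buf.capacity());
}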

@fakeshadow
Contributor

It seems to me that BytesMut will avoid releasing memory if it isn't in the shared state (it will always hold the peak memory until the drop)

My main concern is that drop does not dealloc memory. As in my previous post, this issue is more about websockets and live connections, and it affects everyone using h1. Reducing the peak memory footprint is important too, but it should come after.

I think a good starting point would be to try to reduce that 8kB allocation for the read_buffer? That would improve the case where we have a large number of live connections. After that, we could try to look for good points to drop/split the write_buffer.

8kB is already a big cut from the 32kB watermark used in v3, and we should bear in mind that people use h1 differently and can have big request bodies. Therefore I don't believe another straight-up cut to the reserve size is a good option. I would prefer a public API for changing the size and/or a dynamic buffer reserve strategy.
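
Something along these lines is what I mean by a dynamic reserve strategy (a rough sketch inspired by hyper's read-buffer autosizing, not existing actix-web code; the type name and thresholds are made up):

struct AdaptiveReserve {
    next: usize,
    min: usize,
    max: usize,
}

impl AdaptiveReserve {
    fn new() -> Self {
        Self { next: 1024, min: 1024, max: 32 * 1024 }
    }

    // Record how many bytes the last read produced and adjust the amount
    // reserved before the next read: grow after full reads, shrink after
    // consistently small ones.
    fn record_read(&mut self, read: usize) {
        if read >= self.next {
            self.next = (self.next * 2).min(self.max);
        } else if read < self.next / 4 {
            self.next = (self.next / 2).max(self.min);
        }
    }

    fn reserve_hint(&self) -> usize {
        self.next
    }
}

fn main() {
    let mut strategy = AdaptiveReserve::new();
    for read in [4096, 8192, 128, 64] {
        strategy.record_read(read);
        println!("after a {} byte read, reserve {} bytes next", read, strategy.reserve_hint());
    }
}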

@therealprof
Contributor

8kB is already a big cut from the 32kB watermark used in v3, and we should bear in mind that people use h1 differently and can have big request bodies. Therefore I don't believe another straight-up cut to the reserve size is a good option. I would prefer a public API for changing the size and/or a dynamic buffer reserve strategy.

I don't think it makes sense to allocate a largish buffer if it's not certain that you'll need it. If you have sizeable request bodies you'll typically need to resize anyway (regardless of whether the buffer is 1kB, 8kB or 32kB), but in every other case you're always losing... I wouldn't mind having an API which allows controlling the buffer, or pre-sizing it when you know you'll need more (to avoid potentially multiple resizes), but reducing the initial size seems like an obvious and trouble-free win to me.
