
possible memory leak in server integrations, origins unclear #590

Closed
Zk2u opened this issue Feb 27, 2023 · 20 comments
Labels: question (Further information is requested)

Comments

Zk2u commented Feb 27, 2023

Context: #581

Removed all API use in valeralabs/web and hardcoded a user profile into users.rs. Running a stress test for 120 seconds:

Concurrency Level:      10
Time taken for tests:   120.001 seconds
Complete requests:      904656
Failed requests:        0
Total transferred:      4358651880 bytes
HTML transferred:       4260043940 bytes
Requests per second:    7538.75 [#/sec] (mean)
Time per request:       1.326 [ms] (mean)
Time per request:       0.133 [ms] (mean, across all concurrent requests)
Transfer rate:          35470.55 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       2
Processing:     0    1   0.5      1      23
Waiting:        0    1   0.5      1      23
Total:          0    1   0.5      1      23

Percentage of the requests served within a certain time (ms)
  50%      1
  66%      1
  75%      2
  80%      2
  90%      2
  95%      2
  98%      2
  99%      3
 100%     23 (longest request)

The memory usage of the web server increased from ~5MB to 357MB, and did not decrease after the benchmark. This may be due to #529 somewhere. I'm not sure where this issue is coming from yet.

Zk2u commented Feb 27, 2023

I tested the hackernews example on Actix using regular serde. 3190 requests increased memory usage from 5MB to 17.5MB (again, not decreasing afterwards).

gbj commented Feb 27, 2023

Do you have a reason to believe this is coming from within Leptos in particular? In other words, if you run the same stress test in Actix with the same number of requests but serve some static or generated HTML instead of running it through Leptos, what are the results?

gbj commented Feb 27, 2023

This conversation may also provide a helpful frame of reference.

See also this GitHub issue and discussion.

Zk2u commented Feb 27, 2023

The first test calls another Actix server, our API, on every request (1:1 with leptos). The API instance remained at 4.8MB throughout the test, but the Leptos server grew to 977MB (check the latest reply on #581).

gbj commented Feb 28, 2023

I'm happy to take a look at this locally and try to figure it out, if you can share the steps you used for the stress testing. This is an area I'm pretty unfamiliar with.

@gbj gbj added the question (Further information is requested) label Feb 28, 2023
@Zk2u Zk2u changed the title from "Leptos is leaking memory on Actix ⚠️" to "Leptos is potentially leaking memory on Actix ⚠️" Feb 28, 2023

Zk2u commented Feb 28, 2023

I'm using ApacheBench - if you're on a MacBook (I think...), it should be preinstalled. The command I'm using is:

ab -n 15000 -c 10 http://127.0.0.1:3000/

Zipped test app. Run with cargo leptos watch -r
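
For reference, one way to watch the server's resident memory while the benchmark runs (a rough sketch, assuming a Unix-like system; "valeralabs-web" is a hypothetical process name, so substitute whatever your server binary is called):

while true; do ps -o rss= -p "$(pgrep -f valeralabs-web | head -n 1)" | awk '{printf "%.1f MB\n", $1 / 1024}'; sleep 1; done

This polls the process's RSS once a second, which is enough to see whether memory plateaus or keeps climbing during the run.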

gbj commented Feb 28, 2023

Thanks. I ran this against the todo apps to avoid spamming the 3rd-party hackernews API I use, and was able to reproduce on a smaller scale. Interestingly, Axum seems to do a better job of shrinking its memory footprint back down between bursts of requests, but I agree there's probably a Leptos leak somewhere.

I checked and can confirm that the Runtime structs are being dropped for each request, and that the RUNTIMES slotmap capacity is staying stable, both of which are very good signs (although it would have been an easy fix).

Based on some limited testing, I think the same behavior appears with very simple Leptos examples. I'm not very familiar with tools to trace this kind of memory leak, so I'd appreciate any help.

gbj commented Feb 28, 2023

Okay I tested this against a route that just uses render_to_string and found completely flat memory use, which is great. So it's something inside the render_to_stream implementation, which makes perfect sense to me.
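
Something along these lines is enough to exercise the synchronous path in isolation (a minimal sketch; it mirrors the streaming reproduction later in this thread and assumes the same 0.2-era leptos::ssr API, with an arbitrary view body):

use leptos::{ssr::render_to_string, view, IntoView};

fn main() {
    // Render the same view synchronously in a very tight loop, analogous to the
    // streaming loop below, to compare memory behavior between the two paths.
    loop {
        let html = render_to_string(|cx| view! { cx,
            <div>
                <p>"Paragraph 1."</p>
            </div>
        }.into_view(cx));
        println!("{html}");
    }
}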

Zk2u commented Feb 28, 2023

A similar thing may have occurred in #103 (adding for potential extra context).

@Zk2u Zk2u changed the title from "Leptos is potentially leaking memory on Actix ⚠️" to "leptos_actix::render_app_to_stream_with_context is leaking memory on Actix" Feb 28, 2023

gbj commented Feb 28, 2023

A similar thing may have occurred in #103 (adding for potential extra context).

Thanks. That particular issue was from before I implemented proper management for runtimes, which I've confirmed are now being disposed properly, so I think it's a red herring. But having narrowed it down to the streaming implementation is very helpful, and I can dig in a little more later.

@gbj gbj changed the title from "leptos_actix::render_app_to_stream_with_context is leaking memory on Actix" to "leptos_dom::ssr::render_to_stream is leaking memory" Mar 1, 2023

gbj commented Mar 1, 2023

Edited the title, if you don't mind. I was able to reproduce with this very simple main.rs, which makes it much easier for me to test.

use leptos::{ssr::render_to_stream, view, IntoView};
use futures::StreamExt;

#[tokio::main]
async fn main() {
    loop {
        let mut stream = Box::pin(render_to_stream(|cx| view! { cx,
            <div>
                <p>"Paragraph 1."</p>
                <p>"Paragraph 2."</p>
                <p>"Paragraph 3."</p>
            </div>
        }.into_view(cx)));
        while let Some(chunk) = stream.next().await {
            println!("{chunk}");
        }
    }
}

gbj commented Mar 1, 2023

Oh no, I've been a very naughty boy.

It turns out if you write a function called run_scope_undisposed and you write in the docs, "If you do not dispose of the scope on your own, memory will leak" then... if you do not dispose of the scope on your own, memory will leak. To be honest I thought those would be cleaned up when disposing of the runtime, but apparently I thought wrong.
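
To spell out the failure mode, here's a deliberately simplified, hypothetical sketch (the global arena, the run_scope_undisposed stand-in, and the disposer below are invented for illustration and are not the real Leptos internals):

use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

// Stand-in for the reactive system's global storage: scope id -> data owned by that scope.
static SCOPES: Mutex<Option<HashMap<usize, Vec<String>>>> = Mutex::new(None);
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

// Dropping this does nothing; only an explicit dispose() frees the scope's slot.
struct ScopeDisposer(usize);

impl ScopeDisposer {
    fn dispose(self) {
        if let Some(scopes) = SCOPES.lock().unwrap().as_mut() {
            scopes.remove(&self.0);
        }
    }
}

// "If you do not dispose of the scope on your own, memory will leak."
fn run_scope_undisposed<T>(f: impl FnOnce(usize) -> T) -> (T, ScopeDisposer) {
    let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
    SCOPES
        .lock()
        .unwrap()
        .get_or_insert_with(HashMap::new)
        .insert(id, Vec::new());
    (f(id), ScopeDisposer(id))
}

fn main() {
    for request in 0..1_000_000u64 {
        let (_html, disposer) = run_scope_undisposed(|id| {
            // Pretend this is a render: the scope accumulates some per-request state.
            if let Some(scopes) = SCOPES.lock().unwrap().as_mut() {
                scopes
                    .get_mut(&id)
                    .unwrap()
                    .push(format!("response for request {request}"));
            }
            String::from("<div></div>")
        });
        // Comment out the next line and the arena grows on every iteration,
        // even though everything else has gone out of scope.
        disposer.dispose();
    }
}

Nothing cleans up the arena entry except the explicit dispose() call, so skipping it grows the map on every request.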

So, I've managed to fix the memory leak I created in my example above. I'll need to do a little more testing to make sure I've got it figured out in the Actix integration case as well. I'm out of time this morning but should be able to finish this tonight, I'd think.

Zk2u commented Mar 1, 2023

That made me laugh. Nice job. Let me know when you've finished, and I'll rerun the initial stress test to confirm.

@gbj gbj changed the title from "leptos_dom::ssr::render_to_stream is leaking memory" to "create_resource is leaking memory" Mar 1, 2023
@gbj gbj changed the title from "create_resource is leaking memory" to "render_to_stream and create_resource leak memory" Mar 1, 2023

gbj commented Mar 1, 2023

I was actually able to reproduce the memory leak that I think is causing most of this problem.

It has to do with the interaction between spawning a large number of Tokio tasks and memory allocators not returning memory to the system.

I could toggle the memory leak on and off by rendering an app with a single create_resource, and then toggling a tokio::task::spawn_local on and off within leptos::spawn_local, which is used to load async resources.

It's entirely possible I'm using Tokio wrong here and there's an easy solution. But see also discussion in several Tokio issues:

tokio-rs/tokio#4406

tokio-rs/tokio#4321 (comment)

Zk2u commented Mar 1, 2023

So this isn't actually a “leak”, but rather the allocator holding onto the memory after it has finished handling requests, because it's faster to reuse?

What happens when we run it in a VM with a limited amount of memory?

gbj commented Mar 1, 2023

Nope, I was wrong. The longer I ran it, the clearer it became that this was a real leak. I've managed to create a streaming, non-leaking version, so I think I should be able to get those changes integrated into the Actix and Axum pieces.

gbj commented Mar 2, 2023

...Okay, sorry for all the back and forth here. I've managed to create several different memory leaks during my experiments, usually by doing something in a relatively stupid way. (For example, my "it was clearly a real leak" above was due to a test that looped over the synchronous render_to_string, spawning a basically unbounded number of async resources outside any context in which they would be handled properly.)

Importantly: when I isolate the largest possible chunks of Leptos code from the Axum or Actix integrations (literally everything that generates the streaming response), run it in a very tight loop, and log the streams of HTML out to the console, with no Actix or Axum server involved, I am unable to produce any kind of memory leak. Memory use stays stable and fairly low (about 5MB in my Actix test).

I'll merge my fix to render_to_stream, which was a real memory leak, although I don't think it was actually being used directly in the integrations. I'd recommend trying your test again once I've merged that, just in case. Importantly, let it run for a long time and see whether memory ever plateaus, or whether the growth is continuous and linear.

And maybe see whether using jemalloc does anything to the memory use; it was recommended in several places as returning memory to the OS more aggressively.

To use jemalloc, I think you can just add it to Cargo.toml

[target.'cfg(not(target_env = "msvc"))'.dependencies]
jemallocator = "0.5"

and then in main.rs

#[cfg(not(target_env = "msvc"))]
use jemallocator::Jemalloc;

#[cfg(not(target_env = "msvc"))]
#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;

Beyond that, I'm not really sure what to say. I've spent a lot of time testing all the different combinations and I'm unable to produce a memory leak from within Leptos's code... If you're able to produce a reproduction of a leak that's using Leptos in a plain binary, independent of Actix or Axum, I'd be really happy to take a look, but I don't think there's anything else I can do at this point.

EDIT: Oops, I managed to click the wrong button and closed and reopened this. Thanks, GitHub.

@gbj gbj closed this as completed Mar 2, 2023
@gbj gbj reopened this Mar 2, 2023
@gbj gbj changed the title from "render_to_stream and create_resource leak memory" to "possible memory leak in server integrations, origins unclear" Mar 2, 2023

Zk2u commented Mar 2, 2023

I'm going to have a go at containerising the server, limiting its memory, and then stressing it to see whether it runs out of memory or not. I'll get back to you if this turns out to be an actual issue with Leptos.

I am curious why it keeps increasing, however, since I'd expect it to plateau at some point and start reusing allocated memory. The RPS doesn't change in any major way throughout the bench.

Zk2u commented Mar 2, 2023

I managed to containerise the server and... boom. That confirms it. There isn't a memory leak, but the allocator is not returning the memory to the system.

Again, this was the same continuous benchmark as before, run for 10 minutes. The web server's container was limited to 32 MB of memory, and the server had no issues staying within that. So I'm just marking this as done.
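
For anyone reproducing the setup, something like this is enough to apply the limit (a hedged sketch assuming Docker; the image tag is hypothetical, and the port mapping assumes the server listens on 3000 as in the earlier ab command, so adjust both to however you normally package and run the app, then point the same ab command as above at it):

docker run --rm -m 32m -p 3000:3000 valeralabs-web:stress

With -m 32m the container would eventually be OOM-killed if there really were unbounded growth, so surviving a 10-minute run is a decent sanity check that the allocator is holding onto and reusing pages rather than leaking.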

Edit: I'm interested in what those drops are and what controls them...

[screenshots: container memory usage graphs over the 10-minute benchmark, showing periodic drops]

@Zk2u Zk2u closed this as completed Mar 2, 2023

gbj commented Mar 2, 2023

Beautiful! And thanks for raising it... Having worked through it all, I feel much more confident than I did before that we're not doing anything wrong.
