
Seeing memory leak under normal usage. #92

Closed
SkPhilipp opened this issue Jun 2, 2017 · 8 comments

@SkPhilipp

SkPhilipp commented Jun 2, 2017

Hi, we're using imageproxy in a Docker container, running in our on-premise Kubernetes environment.

When load-testing with a single image, we see memory use climbing until our system eventually kills the process for using too much memory. This happens very quickly even under moderate load. Our Docker image is practically the same as https://hub.docker.com/r/willnorris/imageproxy/~/dockerfile/ except that we start it with a domain whitelist. It is started with CMD /go/bin/imageproxy -addr 0.0.0.0:80 -whitelist (... list of our domains )

We simply GET /400x,q80/https://cmgtcontent.ahold.com.kpnis.nl/cmgtcontent/media//001746500/000/001746560_001_superhero_BBQ_170523_(1).jpg. After about 20,000 requests the Docker instance reaches its memory limit and is killed.
When idle, it stays at its normal ~30 MB.

@willnorris
Owner

Interesting. And just to ask a dumb question which you sort of already answered, but I'll ask anyway... you're not using the in-memory cache right? (by passing -cache memory) 😄

@SkPhilipp
Author

We didn't pass any additional caching options; memory is the default, right? Anyway, we're only pulling the same image with the same parameters. I'd expect the in-memory cache to fill up if I were requesting many different images, but this is just the same image over and over.

@Shimmi

Shimmi commented Jun 14, 2017

Hi,
we're facing the same problem. We tried three different cache providers (in-memory, S3, and the file cache), but none of them worked well for us and the machine kept getting killed. In our tests, even a single URL with the same parameters appears to be served as a (we think) non-cached image: with each request the memory climbed higher and higher, and requesting the same URL roughly 30 times in a row was enough to get the proxy killed.

We saw the image being properly saved to the cache (tested with the S3 and file caches), but it was never loaded from it. Every reload of the image caused it to be re-saved to the cache.

After those tests, we put an nginx proxy cache in front of it, and the server went from being constantly overwhelmed and killed twice a day to almost zero utilization. Weird... :)

@willnorris
Owner

“Memory is default right?”

Default is no caching at all.

Based on @Shimmi's additional info, this definitely sounds like an issue. I'll try to look into it.

@willnorris
Owner

okay, I'm pretty sure I've figured out what is going on here. The short answer is that transformation was being performed on already cached images, resulting in a bunch of extra memory allocations and CPU usage. A fix will be pushed shortly.

A slightly longer explanation is below.

HTTP Caching

First, some background which you probably already know. There are two main ways of doing caching in HTTP. The Expires and Cache-Control headers allow the server to specify the lifetime of a resource such that a client can continue to use a cached copy for a specified amount of time. If the client already has the resource cached and it hasn't expired, then no additional round trips are needed at all to use the resource.

The Etag and Last-Modified headers allow a client that has a cached version of a resource to check with the server to see if a new version is available. This always requires an additional round trip to the server, but assuming there have been no changes, the server can simply respond with a 304 status code, which instructs the client to use their cached copy. There's no need to send the resource over the wire again.
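As a concrete (made-up) example, a client-side revalidation in Go looks roughly like this; the URL and validator values are hypothetical, and the point is just that a matching validator gets back a 304 with no body:

```go
package main

import (
	"fmt"
	"net/http"
)

func main() {
	// Hypothetical values: a client that cached the image earlier would have
	// remembered the validators the server sent along with it.
	cachedETag := `"abc123"`
	cachedLastModified := "Fri, 02 Jun 2017 10:00:00 GMT"

	req, err := http.NewRequest("GET", "https://example.com/image.jpg", nil)
	if err != nil {
		panic(err)
	}
	// Ask the server to skip the body if the resource hasn't changed.
	req.Header.Set("If-None-Match", cachedETag)
	req.Header.Set("If-Modified-Since", cachedLastModified)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	if resp.StatusCode == http.StatusNotModified {
		fmt.Println("304: nothing changed, keep using the cached copy")
		return
	}
	fmt.Println("200: resource changed, read the new body and refresh the cache")
}
```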

httpcache

Imageproxy uses the gregjones/httpcache package to handle caching. For connections between imageproxy and remote servers, httpcache takes care of everything for us, including both flavors of caching mentioned above. It enforces all the right validation checks, sends etags to the server when needed, etc. If the server responds with a 304, then httpcache will return its cached copy with a 200 status. Unless you inspect the X-From-Cache header, there's no way to know if the response came from the cache or the remote server.
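For reference, wiring httpcache in looks roughly like this (a simplified sketch with a made-up URL, not imageproxy's actual setup):

```go
package main

import (
	"fmt"
	"net/http"

	"github.com/gregjones/httpcache"
)

func main() {
	// In-memory cache; the same Transport can be backed by disk, S3, etc.
	transport := httpcache.NewMemoryCacheTransport()
	client := transport.Client()

	for i := 0; i < 2; i++ {
		resp, err := client.Get("https://example.com/image.jpg")
		if err != nil {
			panic(err)
		}
		resp.Body.Close()

		// httpcache revalidates with Etag/Last-Modified when needed and turns an
		// upstream 304 into a 200 served from its cache; this header is the only
		// hint that the body came out of the cache.
		fmt.Println("X-From-Cache:", resp.Header.Get(httpcache.XFromCache))
	}
}
```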

For "downstream" connections (those between the browser and imageproxy), we reuse the same caching headers as the upstream resource. That is, we use the same values for Expires, Cache-Control, Etag, and Last-Modified. For Expires and Cache-Control, this means that clients can just keep using their cached versions without talking to us again. But for Etag and Last-Modified, we need to handle those requests and respond with a 304 status when appropriate. Currently, this is happening here.

So, for a remote image (like this one) that serves Etag and Last-Modified headers but not Expires or Cache-Control headers, the flow looks something like this:

  1. Check for transformed image in cache
    • found it in the cache, but it doesn't have an Expires or Cache-Control, so we need to check to see if the original image has been updated
  2. Check for original (non-transformed) image in cache
    • found it in the cache, but it doesn't have an Expires or Cache-Control, so we need to check with the server to see if it's been updated
  3. Refetch the remote image, including the etag and last-modified values in the request
    • remote image hasn't been updated (got a 304 response from remote server), so use the cached versions
  4. Transform the image as requested
  5. Check if the original request contained etag or last-modified data
    • one or both match, so we can return a 304 response

Steps 1-3 still need to happen to make sure the remote image hasn't changed, but steps 4 and 5 should be flipped. That was just an oversight on my part when I originally implemented this and never noticed. We were still returning a 304 response, so from the client's perspective everything looks fine. But inside the server, we're doing a transformation in step 4 that isn't needed.
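In outline, the fix just moves the conditional check ahead of the transformation. A hypothetical sketch of the corrected ordering (not the actual patch, and only checking If-None-Match for brevity):

```go
package proxyexample

import "net/http"

// handleProxy is a hypothetical outline of the corrected flow, not the actual
// patch: the conditional check now runs *before* the transformation, so a 304
// costs no decode/resize work.
func handleProxy(w http.ResponseWriter, r *http.Request, cachedETag string) {
	// Steps 1-3: fetch or revalidate the original image (elided here).

	// Former step 5, now first: if the client's validator still matches,
	// answer 304 and skip the transformation entirely.
	if inm := r.Header.Get("If-None-Match"); inm != "" && inm == cachedETag {
		w.WriteHeader(http.StatusNotModified)
		return
	}

	// Former step 4: only transform when a body actually has to be sent.
	// transformAndServe(w, r) // elided
}
```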

Anyway, like I said... fix will be pushed shortly. I'll leave this bug open until you confirm that the new version seems to have fixed the problem for you.

@willnorris
Owner

Oh, I also meant to add that I was able to easily reproduce this, with imageproxy using hundreds of megabytes of memory when requesting the same cached image thousands of times (rakyll/hey is great for testing this, by the way). After the forthcoming fix, memory stayed constant at 65 MB even with 20,000 requests.
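For anyone who wants to reproduce this without installing hey, a crude Go loop along these lines generates the same kind of repeated load; the URL, request count, and concurrency are made up:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"
)

func main() {
	// All three values are hypothetical; point this at a local imageproxy instance.
	const (
		url     = "http://localhost:8080/400x,q80/https://example.com/photo.jpg"
		total   = 20000
		workers = 50
	)

	jobs := make(chan struct{}, total)
	for i := 0; i < total; i++ {
		jobs <- struct{}{}
	}
	close(jobs)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				resp, err := http.Get(url)
				if err != nil {
					continue
				}
				io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
				resp.Body.Close()
			}
		}()
	}
	wg.Wait()
	fmt.Println("done:", total, "requests")
}
```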

willnorris added a commit that referenced this issue Jun 15, 2017
If the caching headers in the request are valid, return a 304 response
instead of doing the transformation.

Ref #92
@Shimmi

Shimmi commented Jun 15, 2017

@willnorris Many thanks! That was fast :)

I've tested it and can confirm the memory consumption is OK for me.

But we are still seeing high CPU load, and the images are being re-processed (tested with the file cache).

Hitting F5 on the same image URL about 30 times resulted in high CPU load:
[screenshot: imageproxy_htop]

Also, the file in the file cache is being modified with every new request:
[screenshot: imageproxy_filecache]

The behaviour I would expect is that if neither the source image nor the URL parameters change, imageproxy should simply serve the image from the cache and not touch it again. Not sure what causes this...

@willnorris
Owner

okay, I'm going to close this as having fixed the originally reported problem, the memory leak. Please open a new issue to focus on the high CPU load, and I'll try to investigate.

As for the cached file being rewritten, that will end up needing to be fixed in the httpcache package. I'd suggest opening a bug there, and I'll try and take a look at what would be involved in fixing it.
