I'm trying to troubleshoot a performance issue with Stagemonitor that only shows up when running in a container. To isolate the problem, I changed the Spring Pet Clinic demo setup for Stagemonitor to introduce a method that gets called many times per request, e.g. 1 million times (just to amplify the issue) - you can find the change here. The change affects the "Veterinarian" tab. What I'm seeing is:
Local java + stagemonitor OFF = < 200ms
Local java + stagemonitor ON = < 200ms
Containerized docker + stagemonitor OFF = < 200ms
Containerized docker + stagemonitor ON = ~7 seconds
So for some reason this overhead is only really noticeable with the combination of container + stagemonitor. Running outside of a container with stagemonitor works fine, and running in a container without stagemonitor works fine; only the combination produces the large overhead. The Docker image I'm using is openjdk:8u162-jdk; more details are on its Docker Hub page here. The Java versions were similar, although locally I was running 8u161.
Any ideas? I'm trying to think of other variables to narrow this down. I looked at the JIT logs via -XX:+PrintCompilation, but nothing stood out as different between docker + stagemonitor ON and local + stagemonitor ON. I've also seen this issue in production on longer-lived JVMs, so I don't think it's one of the common benchmarking pitfalls such as not waiting long enough for JIT compilation to occur. Our stagemonitor setup includes instrumentation only for our own packages, e.g. "com.mycompany". We've since added some excludes to weed out the granular, low-level methods, but that isn't ideal, and given that this works fine locally I'm not sure we should even have to do it.
I haven't had a chance to try other images yet. I've also reproduced this on EC2 instances running Linux, although for the testing above I was using Docker for Mac (so probably not a true apples-to-apples comparison, but you can see that local/stagemonitor OFF and container/stagemonitor OFF perform about the same).
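To help compare the two environments, here is a minimal sketch of a standalone probe that could be run both locally and inside the container. It prints a few JVM-visible settings that often differ under Docker (CPU count, max heap, Java version) and times a tight loop of 1 million calls to a tiny method. The class `ContainerProbe` and the method `tinyMethod` are hypothetical names used only for illustration; they are not part of Stagemonitor or the Pet Clinic change, and the loop only approximates the amplified method described above. Whether any of these settings explains the slowdown is an open question, not a conclusion.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ContainerProbe {

    // Hypothetical stand-in for the small method that gets called ~1 million times per request.
    static int tinyMethod(int i) {
        return i % 7;
    }

    // Calls tinyMethod 'calls' times and returns the sum so the JIT cannot discard the work.
    static long runHotLoop(int calls) {
        long sum = 0;
        for (int i = 0; i < calls; i++) {
            sum += tinyMethod(i);
        }
        return sum;
    }

    // Collects JVM-visible settings that may differ between a local run and a container run.
    static Map<String, String> environment() {
        Map<String, String> env = new LinkedHashMap<String, String>();
        Runtime rt = Runtime.getRuntime();
        env.put("java.version", System.getProperty("java.version"));
        env.put("os.name", System.getProperty("os.name"));
        env.put("availableProcessors", String.valueOf(rt.availableProcessors()));
        env.put("maxMemoryMiB", String.valueOf(rt.maxMemory() / (1024 * 1024)));
        return env;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : environment().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
        // Repeat the measurement so later rounds reflect JIT-compiled code, not warm-up.
        for (int round = 1; round <= 5; round++) {
            long start = System.nanoTime();
            long sum = runHotLoop(1000000);
            long micros = (System.nanoTime() - start) / 1000;
            System.out.println("round " + round + ": " + micros + " us (sum=" + sum + ")");
        }
    }
}
```

Running the same class with and without the Stagemonitor agent attached, in both environments, would give four sets of numbers to line up against the timings listed above.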