I noticed that if an application has a very large number of threads (15k or more), jmx_exporter can cause the program to hang. The main thread hangs for as long as the jmx_exporter scrape takes to finish (10+ seconds in my tests). I wrote a simple program that reproduces the issue:
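    public class ThreadHangRepro {
        public static void main(String[] args) throws Exception {
            final int count = 15_000;
            final Thread[] threads = new Thread[count];
            // Start many threads that do nothing but sleep.
            for (int i = 0; i < threads.length; i++) {
                threads[i] = new Thread(() -> {
                    while (true) {
                        try {
                            Thread.sleep(500);
                        } catch (InterruptedException e) {
                            // Ignored: the thread keeps sleeping forever.
                        }
                    }
                });
                threads[i].start();
            }
            // Print a timestamp every 100 ms; a gap in the output means the main thread stalled.
            while (true) {
                System.out.println("[time=" + System.currentTimeMillis() + "]");
                Thread.sleep(100);
            }
        }
    }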
If you run this with jmx_exporter attached and call curl http://localhost:123 in the background, the main thread freezes intermittently (about 30% of the time). You might have to adjust some of the timings for the freeze to appear.
I traced the source of the delay to the ThreadExports.java class, lines 110-124. There is a filter there that, if enabled, disables JVM_THREADS_STATE / jvm_threads_state. Enabling this filter prevents the issue from happening.
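To illustrate why this cost would grow with the thread count, here is a minimal sketch of per-state thread counting through the standard `ThreadMXBean` API. It assumes the collector gathers `jvm_threads_state` in roughly this way; the class name `ThreadStateCount` is hypothetical and this is not the ThreadExports.java code itself. Whether the JVM pauses other threads while building the snapshot depends on the JVM implementation and has not been verified here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical illustration: counts live threads per Thread.State.
class ThreadStateCount {
    static Map<Thread.State, Integer> countStates() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // maxDepth = 0 requests no stack frames, only per-thread metadata.
        ThreadInfo[] infos = bean.getThreadInfo(bean.getAllThreadIds(), 0);
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        Map<Thread.State, Integer> counts = countStates();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(counts + " collected in " + elapsedMs + " ms");
    }
}
```

Running this inside the 15,000-thread reproduction program, instead of a fresh JVM, gives a rough idea of how long a single snapshot takes on a given machine.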
The problem, and the reason I'm reporting this as an issue, is that there's no way to disable just the jvm_threads_state collection in jmx_exporter. All of the rules in the config are applied after the collectors run, not before. I believe the fix would be to pass the filter information down to the HTTPServer.java class and then call filteredMetricFamilySamples() instead of metricFamilySamples().
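A minimal sketch of the kind of name-based exclusion this would need, written against plain `java.util.function.Predicate<String>` rather than any jmx_exporter or client_java class. The class and method names (`ExcludeByName`, `excluding`, `namesToCollect`) are hypothetical, the second metric name is only a placeholder, and how such a predicate would be wired into the exporter is not shown.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration of filtering metric names before collection.
class ExcludeByName {
    // Returns a predicate that accepts every name except the listed ones.
    static Predicate<String> excluding(String... names) {
        List<String> excluded = Arrays.asList(names);
        return name -> !excluded.contains(name);
    }

    // Keeps only the names the filter accepts; a collector whose names are
    // all rejected would not need to run at all.
    static List<String> namesToCollect(List<String> all, Predicate<String> filter) {
        return all.stream().filter(filter).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Predicate<String> filter = excluding("jvm_threads_state");
        List<String> all = Arrays.asList("jvm_threads_state", "jvm_threads_current");
        System.out.println(namesToCollect(all, filter)); // [jvm_threads_current]
    }
}
```

The point of applying the predicate before collection, rather than to the results, is that the expensive work for an excluded metric is never started.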
Note: It is possible to exclude a JVM metric through the curl call to the server, e.g. curl http://localhost:123?name[]=my_metric, but this is extremely limited. In particular, you have to select the metrics you want by name; you can't use regex or negation. Put another way, if you have 3,000 metrics and want to filter out 1, you would have to list the other 2,999 using this technique. Ideally, the solution should be part of the jmx_exporter config.
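To make the scale of that workaround concrete, here is a small sketch that builds the `name[]` query string for every metric except one. The metric names (`metric_0`, `metric_1`, ...) and the class `NameQuery` are made up for illustration; only the `name[]=` parameter form comes from the curl example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration: an allow-list query that omits a single metric.
class NameQuery {
    // Joins every name except the unwanted one as name[]=... parameters.
    static String buildQuery(List<String> allNames, String unwanted) {
        return allNames.stream()
                .filter(n -> !n.equals(unwanted))
                .map(n -> "name[]=" + n)
                .collect(Collectors.joining("&"));
    }

    // Generates placeholder metric names metric_0 .. metric_(n-1).
    static List<String> sampleNames(int n) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            names.add("metric_" + i);
        }
        return names;
    }

    public static void main(String[] args) {
        String query = buildQuery(sampleNames(3_000), "metric_42");
        System.out.println(query.split("&").length + " parameters, "
                + query.length() + " characters");
    }
}
```

Besides being long, such a URL has to be regenerated whenever a metric is added or removed, which is why a config-level exclusion is preferable.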
Here's some sample output (I added %100_000 to the timestamp in the main loop's print statement for readability):
[time=82539]
[time=82652]
[time=82767]
[time=82875] <--- Moment in which the curl command was called
[time=10397]
[time=10809]
[time=10918]
[time=11027]
[time=11137]
In this sample, calling jmx_exporter locks the main thread for more than 20 seconds (the printed value wraps past 100,000 between 82875 and 10397). As mentioned, though, it's not consistent. I'd estimate it happens about 30% of the time, depending on your local hardware and the number of threads.
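Because the printed timestamps are taken modulo 100,000, the length of the stall has to be recovered with modular subtraction. The sketch below does that for the two values around the curl call, assuming the counter wrapped exactly once during the stall; the class and method names are hypothetical.

```java
// Hypothetical helper for reading gaps out of wrapped timestamps.
class WrappedGap {
    // Elapsed milliseconds between two readings of a counter that wraps at
    // `modulus`, assuming at most one wrap between the readings.
    static long gapMillis(long before, long after, long modulus) {
        return Math.floorMod(after - before, modulus);
    }

    public static void main(String[] args) {
        System.out.println(gapMillis(82767, 82875, 100_000)); // 108
        System.out.println(gapMillis(82875, 10397, 100_000)); // 27522
    }
}
```

Under that single-wrap assumption, the normal interval is about 108 ms and the stalled interval is about 27.5 seconds.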
Per this issue, I created a Pull Request that offers a fix: #760
I could also have modified code in java_client, such as the HTTPServer.java class, but since this class already offers a Predicate<String> sampleNameFilter, I used that instead.
Using the PR with the following config prevents the main thread from locking up while allowing all other metrics to go through:
Even in cases where the main thread doesn't lock up, it shortens the time taken by curl http://localhost:123 from 20 seconds to 1 second in my earlier example.
Version: jmx_exporter 0.17.2