Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

getProcessCpuLoad() always return -1 in first call on jdk11 #13389

Closed
Sidong-Wei opened this issue Aug 25, 2021 · 19 comments · Fixed by #18451 or #18636
Closed

getProcessCpuLoad() always return -1 in first call on jdk11 #13389

Sidong-Wei opened this issue Aug 25, 2021 · 19 comments · Fixed by #18451 or #18636

Comments

@Sidong-Wei
Copy link

Java -version output

openjdk 11.0.12 2021-07-20
IBM Semeru Runtime Open Edition 11.0.12.0 (build 11.0.12+7)
Eclipse OpenJ9 VM 11.0.12.0 (build openj9-0.27.0, JRE 11 Linux amd64-64-Bit Compressed References 20210729_228 (JIT enabled, AOT enabled)
OpenJ9   - 1851b0074
OMR      - 9db1c870d
JCL      - 21849e2ca0 based on jdk-11.0.12+7)

Summary of problem

The getProcessCpuLoad() method of class com.sun.management.OperatingSystemMXBean always return -1 (invaild ret value) in the first call, while in the following calls it will get the correct cpu load value.
I tried to call the same function using OpenJDK11 and AdoptOpenJDK11 with Hotspot, and it will return 0 with these JDKs which is within the valid ret value range. So I expect Openj9 should also have the same behavior for the first call. Here is a simple test code that could reproduce the issue:

import java.lang.management.ManagementFactory;
import com.sun.management.OperatingSystemMXBean;

class Test{
  public static void main(String[] args) {
    while (true) {
        try
        {
            Thread.sleep(10000);
        }
        catch(InterruptedException ex)
        {
            Thread.currentThread().interrupt();
        }
        double cpuUtil = ((com.sun.management.OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean()).getProcessCpuLoad();
        System.out.println(cpuUtil);
    }
  }
}

Diagnostic files

The output I got with different JDK variants is:

sidong@sidong-u18-x1:~$ java --version
openjdk 11.0.12 2021-07-20
OpenJDK Runtime Environment Temurin-11.0.12+7 (build 11.0.12+7)
OpenJDK 64-Bit Server VM Temurin-11.0.12+7 (build 11.0.12+7, mixed mode)
sidong@sidong-u18-x1:~$ java Test
0.0
0.001002004008016032
5.00751126690035E-4
0.0010035122930255895
5.00751126690035E-4
5.015045135406219E-4
sidong@sidong-u18-x1:~$ java --version
openjdk 11.0.12 2021-07-20
IBM Semeru Runtime Open Edition 11.0.12.0 (build 11.0.12+7)
Eclipse OpenJ9 VM 11.0.12.0 (build openj9-0.27.0, JRE 11 Linux amd64-64-Bit Compressed References 20210729_228 (JIT enabled, AOT enabled)
OpenJ9   - 1851b0074
OMR      - 9db1c870d
JCL      - 21849e2ca0 based on jdk-11.0.12+7)
sidong@sidong-u18-x1:~$ java Test
-1.0
0.0031686181725441823
2.0872289188482128E-4
2.556149850413811E-4
3.1896885482841735E-4
1.4443471779581417E-4
2.252603035480593E-4

I also tried the previous stable release (11.0.11+9) and the current openj9 branch and the output is similar.

@DanHeidinga
Copy link
Member

fyi @keithc-ca

@keithc-ca
Copy link
Contributor

keithc-ca commented Aug 31, 2021

The behavior is consistent with the specification at [1]:

If the Java Virtual Machine recent CPU usage is not available, the method returns a negative value.

[1] https://docs.oracle.com/en/java/javase/11/docs/api/jdk.management/com/sun/management/OperatingSystemMXBean.html#getProcessCpuLoad()

@Sidong-Wei
Copy link
Author

@keithc-ca Thanks for your reply, yeah I understand that the negative return value is allowed, and also we know the first call probably could not retrieve the correct recent cpu usage, so the behavior is expected.
I guess my point is that since we already know the first call will always return -1 which is not implying any potential issue with the system, could we sort of differentiating the negative return values between the first time and later calls? i.e., the first call return -1 is normal so we could let it skip, but later calls returning -1 means there really might be something wrong there. If the first -1 could be filtered into a non-negative value (like 0 in OpenJDK), then it probably would not trigger any warnings or errors on the developer side, while the following negative values could keep being monitored.
Please let me know whether you think that will make sense, thank you.

@keithc-ca
Copy link
Contributor

@Sidong-Wei It's not clear to me whether you're suggesting changes to your application (ignore the first -1 returned), or suggesting that OpenJ9 return 0 instead of -1 from the first call. The latter does seem reasonable to me, but we'll have to consider existing use cases (and tests) that might be affected by such a change.

@Sidong-Wei
Copy link
Author

@keithc-ca Yeah I also prefer the latter option a bit more. Changes to the application would definitely work and it could be a workaround for the issue, but I think if it could be resolved at the source then it would be easier for downstream applications to just monitor negative returns normally.
If you would like to take this latter option into consideration, that will be good enough for me now, since it is not a critical issue and at least there exists a workaround for it. Thanks.

@pshipton
Copy link
Member

Controlling the behavior via a command line option is one way to avoid compatibility problems.

@keithc-ca
Copy link
Contributor

@pshipton, I assume you mean the option would restore the old (current) behavior, otherwise the default would still be inconsistent with openjdk? Got a proposal for the option name?

The behavior of getSystemCpuLoad() should mirror that of getProcessCpuLoad() (either both or neither should return -1 the first time either is called).

@pshipton
Copy link
Member

For compatibility I suggest the default behavior remain the same, although we can change it on a release boundary (Java 18 or maybe Java 17 if we do it soon) and document the change. -XX:[+/-]CpuLoadCompatibility

@jonathan-albrecht-ibm
Copy link

I'm wondering if I could get an update on where this issue stands? Is there a plan to implement it in a particular release?

@pshipton
Copy link
Member

@tajila question about scheduling this.

@jonathan-albrecht-ibm
Copy link

Just wanted to check in on the status of this issue again.

@pshipton
Copy link
Member

pshipton commented Feb 7, 2023

Is there a plan to implement it in a particular release?

It's in the backlog, so no.

@tajila
Copy link
Contributor

tajila commented Nov 9, 2023

@thallium Please take a look at this

@thallium
Copy link
Contributor

thallium commented Nov 9, 2023

@thallium Please take a look at this

@TobiAjila Does that mean I need to implement the -XX:[+/-]CpuLoadCompatibility flag?

@tajila
Copy link
Contributor

tajila commented Nov 9, 2023

@TobiAjila Does that mean I need to implement the -XX:[+/-]CpuLoadCompatibility flag?

Yes, and you'll need to change the behaviour of getProcessCpuLoad

@thallium
Copy link
Contributor

@TobiAjila Does that mean I need to implement the -XX:[+/-]CpuLoadCompatibility flag?

Yes, and you'll need to change the behaviour of getProcessCpuLoad

@TobiAjila As suggested in #13389 (comment), do we also want to change behaviour of getSystemCpuLoad()? getSystemCpuLoad() calls omrsysinfo_get_CPU_load() which returns OMRPORT_ERROR_OPFAILED on the first call so we can't tell if it's the first call on the OpenJ9 side. Should I change it to return another error code (maybe OMRPORT_ERROR_INVALID) instead?

@tajila
Copy link
Contributor

tajila commented Nov 10, 2023

@TobiAjila As suggested in #13389 (comment), do we also want to change behaviour of getSystemCpuLoad()?

Yes, both should mirror each other.

so we can't tell if it's the first call on the OpenJ9 side. Should I change it to return another error code (maybe OMRPORT_ERROR_INVALID) instead?

I think its safe to assume that the only the JVM will be calling this. Its possible that someone writes a native and calls before using the MXBean api, but in that case I think we would behave similar to the RI. We can verifiy that to be sure.

In any case, the main change here is that, instead of returning -1, we return zero. We are not tracking how many times the call has been made.

@thallium
Copy link
Contributor

In any case, the main change here is that, instead of returning -1, we return zero. We are not tracking how many times the call has been made.

I'd assume you mean returning 0 from omrsysinfo_get_CPU_load(), but then what about the compatibility flag? If we return 0, we can't tell if it's the first-time call or the CPU load is actually 0?

@tajila
Copy link
Contributor

tajila commented Nov 13, 2023

I'd assume you mean returning 0 from omrsysinfo_get_CPU_load(), but then what about the compatibility flag? If we return 0, we can't tell if it's the first-time call or the CPU load is actually 0?

omrsysinfo_get_CPU_load() implicitly tracks if it is being called the first time based on whether oldestCPUTime is set. We may not need to make any changes to omrsysinfo_get_CPU_load as long as we can guarantee that the only time OMRPORT_ERROR_OPFAILED is returned is when omrsysinfo_get_CPU_load is called for the first time. If that is not the case then we will need a new error code to return instead.

The main changes will be made to Java_com_ibm_lang_management_internal_ExtendedOperatingSystemMXBeanImpl_getProcessVirtualMemorySizeImpl. If the error code for first call is returned, then we check the option which will tell us whether to return -1 or 0.

thallium added a commit to thallium/openj9 that referenced this issue Dec 13, 2023
Currently, getProcessCpuLoad() and getSystemCpuLoad() return -1 if only
one data point has been recorded. A compatibility flag
-XX:[+/-]CpuLoadCompatibility is added to match the behaviour of the RI,
which is to return 0.

Fixes: eclipse-openj9#13389
Related: eclipse-omr/omr#7189

Signed-off-by: Gengchen Tuo <gengchen.tuo@ibm.com>
thallium added a commit to thallium/openj9 that referenced this issue Dec 15, 2023
Currently, getProcessCpuLoad() and getSystemCpuLoad() return -1 if only
one data point has been recorded. A compatibility flag
-XX:[+/-]CpuLoadCompatibility is added to match the behaviour of the RI,
which is to return 0.

Fixes: eclipse-openj9#13389
Related: eclipse-omr/omr#7189

Signed-off-by: Gengchen Tuo <gengchen.tuo@ibm.com>
thallium added a commit to thallium/openj9 that referenced this issue Dec 17, 2023
Currently, getProcessCpuLoad() and getSystemCpuLoad() return -1 if only
one data point has been recorded. A compatibility flag
-XX:[+/-]CpuLoadCompatibility is added to match the behaviour of the RI,
which is to return 0.

Fixes: eclipse-openj9#13389
Related: eclipse-omr/omr#7189

Signed-off-by: Gengchen Tuo <gengchen.tuo@ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment