Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

omrsysinfo_get_CPU_load() return a new error for insufficient data #7189

Merged
merged 1 commit into from
Dec 6, 2023

Conversation

thallium
Copy link
Contributor

@thallium thallium commented Nov 22, 2023

This is to determine if a call to omrsysinfo_get_CPU_load has only
one data point recorded.

A new error, OMRPORT_ERROR_INSUFFICIENT_DATA, is returned
in this scenario.

Also fixed a mistake in the description saying that the first two
invocations return OMRPORT_ERROR_OPFAILED and fix the time unit in the
description to ms.

Related: eclipse-openj9/openj9#18451

@thallium
Copy link
Contributor Author

@tajila FYI

port/unix/omrsysinfo.c Outdated Show resolved Hide resolved
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for supporting the project, and congratulations on your first contribution! A project committer will shortly review your contribution. In the mean time, if you haven't had a chance please skim over the contribution guidelines which all pull requests must adhere to. If the ECA pull request check fails, have a look at the instructions for signing the ECA in the legal considerations section.

If you run into any problems our community will be happy to assist you in any way we can. There are a number of recommended ways to interact with the community. We encourage you to ask questions, or drop by to say hello.

@tajila
Copy link
Contributor

tajila commented Nov 22, 2023

@babsingh Please review these changes

@babsingh
Copy link
Contributor

jenkins build all

@babsingh
Copy link
Contributor

Currently, this PR only updates the unix impl. The win32 impl also needs to be updated:
https://github.com/eclipse/omr/blob/8b19b8082de56b067c23bafd59dd767a58d8c02f/port/win32/omrsysinfo.c#L1271-L1275

@babsingh
Copy link
Contributor

Also, the function description needs to be updated to reflect the new error handling:
https://github.com/eclipse/omr/blob/8b19b8082de56b067c23bafd59dd767a58d8c02f/port/common/omrsysinfo.c#L643-L663

@babsingh
Copy link
Contributor

@thallium
Copy link
Contributor Author

Also, the function description needs to be updated to reflect the new error handling:

https://github.com/eclipse/omr/blob/8b19b8082de56b067c23bafd59dd767a58d8c02f/port/common/omrsysinfo.c#L643-L663

Why does it says "on the first two invocations"? Is that a typo?

@babsingh
Copy link
Contributor

babsingh commented Nov 23, 2023

Why does it says "on the first two invocations"? Is that a typo?

I am unable to confirm by just reading the below design and implementation; it doesn't seem straightforward.

@thallium Can you verify by adding print statements for the return value in the existing test (see below)? I can run PR builds on all platforms once you have added debug code in the test. on the first two invocations: might be platform specific.

https://github.com/eclipse/omr/blob/8b19b8082de56b067c23bafd59dd767a58d8c02f/fvtest/porttest/si.cpp#L1689-L1716

@babsingh
Copy link
Contributor

babsingh commented Nov 23, 2023

@fjeremic For #7189 (comment), do you remember why the second invocation of omrsysinfo_get_CPU_load also returns OMRPORT_ERROR_OPFAILED?

@thallium
Copy link
Contributor Author

@babsingh I've added the debug output, can you run the test?

@babsingh
Copy link
Contributor

jenkins build all(test:'-R porttest',env:'GTEST_FILTER=PortSysinfoTest.sysinfo_test_get_CPU_load',compile:'-j8')

If the printf output is not seen, then the log methods in the test suite will need to be used.

@babsingh
Copy link
Contributor

jenkins build all(test:'-R porttest',env:'GTEST_FILTER=PortSysinfoTest.sysinfo_test_get_CPU_load')

@babsingh
Copy link
Contributor

Windows compilation failed:

00:25:44      53>c:\omr\workspace\pullrequest-win_x86-64\build\fvtest\porttest\si.cpp(1705): error C2220: warning treated as error - no 'object' file generated [C:\omr\workspace\PullRequest-win_x86-64\Build\build\fvtest\porttest\omrporttest.vcxproj]
00:25:44      53>c:\omr\workspace\pullrequest-win_x86-64\build\fvtest\porttest\si.cpp(1705): warning C4244: '=': conversion from 'intptr_t' to 'int', possible loss of data [C:\omr\workspace\PullRequest-win_x86-64\Build\build\fvtest\porttest\omrporttest.vcxproj]
00:25:44      53>c:\omr\workspace\pullrequest-win_x86-64\build\fvtest\porttest\si.cpp(1707): warning C4244: '=': conversion from 'intptr_t' to 'int', possible loss of data [C:\omr\workspace\PullRequest-win_x86-64\Build\build\fvtest\porttest\omrporttest.vcxproj]

Test output:

00:43:49  15: Return value of first invocation: -21
00:43:49  15: Return value of second invocation: -1

I only checked a few build jobs. I consistently saw the above test output. @thallium Verify the test output in the PR builds as well. Fix the Windows compilation issue and verify the test output on Windows.

@babsingh
Copy link
Contributor

babsingh commented Nov 24, 2023

The test output for Windows is also the same:

00:43:49  15: Return value of first invocation: -21
00:43:49  15: Return value of second invocation: -1

The second OMRPORT_ERROR_OPFAILED is probably returned from the end of the function. @tajila I don't believe the current changes are sufficient to support eclipse-openj9/openj9#18451. Based upon the above behaviour, there is still ambiguity between the insufficient data and insufficient time cases.

@tajila
Copy link
Contributor

tajila commented Dec 4, 2023

Intent of eclipse-openj9/openj9#18451: Make get[Process|System]CpuLoad() return 0 on the first call.

I don't think this is fully accurate. The real intent is to return zero if we have one data point instead of -1 as the RI does. This is usually the case on the first call.

can the RI still throw an error in the first call?

Im not certain, ubt I would expect that to be the case. We can try to verify this. For example, if /proc/stat was not reachable then I would expect RI to fail as well and not return zero.

If RI doesn't do this, then they are losing information, because returning 0 when there is a hard failure means there is no way to tell that there is a functional issue. In which case, I would still be okay with the behaviour being proposed here.

@babsingh
Copy link
Contributor

babsingh commented Dec 5, 2023

@thallium Please post an update in the PR once the below tasks are completed.

  • Update the documentation to reflect .... The real intent is to return zero if we have one data point instead of -1 as the RI does.
  • Verify RI's behaviour if /proc/stat is not reachable.

@thallium
Copy link
Contributor Author

thallium commented Dec 5, 2023

Verify RI's behaviour if /proc/stat is not reachable.

RI returns -1 in this situation.

@babsingh
Copy link
Contributor

babsingh commented Dec 6, 2023

Thanks @thallium for verifying RI's behaviour. Can you also update the commit message to reflect the new documentation?

Return another error if only one data point has been recorded

This is to determine if a call to omrsysinfo_get_CPU_load has only
one data point recorded.

...

@thallium thallium force-pushed the master branch 2 times, most recently from 4897e4a to 5c4e691 Compare December 6, 2023 15:32
@babsingh
Copy link
Contributor

babsingh commented Dec 6, 2023

new line (return) is missing from the first line.

image

@babsingh babsingh changed the title Make omrsysinfo_get_CPU_load() return another error code on first call omrsysinfo_get_CPU_load() return a new error for insufficient data Dec 6, 2023
@babsingh
Copy link
Contributor

babsingh commented Dec 6, 2023

still a space is missing onlyone -> only one. i have updated the PR description for you.
image

This is to determine if a call to omrsysinfo_get_CPU_load has only one
data point recorded.

Also fixed a mistake in the description saying that the first two
invocations return OMRPORT_ERROR_OPFAILED and fix the time unit in the
description to "ms".

Related: eclipse-openj9/openj9#18451

Signed-off-by: Gengchen Tuo <gengchen.tuo@ibm.com>
@babsingh
Copy link
Contributor

babsingh commented Dec 6, 2023

jenkins build all

@babsingh
Copy link
Contributor

babsingh commented Dec 6, 2023

jenkins build aix

@babsingh
Copy link
Contributor

babsingh commented Dec 6, 2023

Only a known issue is observed:

@babsingh babsingh merged commit e5e7d55 into eclipse-omr:master Dec 6, 2023
16 of 18 checks passed
thallium added a commit to thallium/openj9 that referenced this pull request Dec 13, 2023
Currently, getProcessCpuLoad() and getSystemCpuLoad() return -1 if only
one data point has been recorded. A compatibility flag
-XX:[+/-]CpuLoadCompatibility is added to match the behaviour of the RI,
which is to return 0.

Fixes: eclipse-openj9#13389
Related: eclipse-omr/omr#7189

Signed-off-by: Gengchen Tuo <gengchen.tuo@ibm.com>
thallium added a commit to thallium/openj9 that referenced this pull request Dec 15, 2023
Currently, getProcessCpuLoad() and getSystemCpuLoad() return -1 if only
one data point has been recorded. A compatibility flag
-XX:[+/-]CpuLoadCompatibility is added to match the behaviour of the RI,
which is to return 0.

Fixes: eclipse-openj9#13389
Related: eclipse-omr/omr#7189

Signed-off-by: Gengchen Tuo <gengchen.tuo@ibm.com>
thallium added a commit to thallium/openj9 that referenced this pull request Dec 17, 2023
Currently, getProcessCpuLoad() and getSystemCpuLoad() return -1 if only
one data point has been recorded. A compatibility flag
-XX:[+/-]CpuLoadCompatibility is added to match the behaviour of the RI,
which is to return 0.

Fixes: eclipse-openj9#13389
Related: eclipse-omr/omr#7189

Signed-off-by: Gengchen Tuo <gengchen.tuo@ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants