Sampling allocation bytes precisely without compromising the performance #9745

Merged
merged 1 commit into from
Jun 4, 2020

@LinHu2016 LinHu2016 commented May 29, 2020

In order to sample heap allocation bytes precisely without compromising performance, this change does the following.

instrumentableAllocateHook and VM_OBJECT_ALLOCATE_WITHIN_THRESHOLD are still handled by disabling inline allocation.
Sampling for the tracepoint is still handled during out-of-line allocation.
Sampling for JEP331 is handled via setTLHSamplingTop(size).

A fake heap top is used instead of a fake heap alloc to disable inline allocation (realHeapAlloc-->realHeapTop, set/getRealAlloc()-->set/getRealTop(), getRealSize(), getUsedSize()).
A fake heap top is also used to force out-of-line allocation at the sampling threshold for sampling heap allocation (setTLHSamplingTop()/resetTLHSamplingTop()).
setTLHSamplingTop(size) is called only in the following 3 cases:
1, the sampling threshold has been changed via the GC-VM API j9gc_set_allocation_sampling_interval()
2, the TLH is refreshed
3, after sampling is done
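
To illustrate the fake heap top idea described above, here is a minimal sketch. It is not the OpenJ9 implementation: `SketchTLH`, `setSamplingTop`, `resetSamplingTop`, `inlineAllocate` and `outOfLineAllocate` are hypothetical names, and TLH refresh is not modelled. The point is only that lowering the top checked by the inline fast path makes the allocation that crosses the sampling threshold fall through to the out-of-line path, where the sample can be taken.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified thread-local heap (TLH); not the real OpenJ9 structure.
struct SketchTLH {
    uintptr_t heapAlloc = 0;    // next free byte
    uintptr_t heapTop = 0;      // limit checked by the inline fast path (may be fake)
    uintptr_t realHeapTop = 0;  // true limit while heapTop is fake, otherwise 0
};

// Lower heapTop so the fast path fails once bytesUntilSample more bytes are used.
inline void setSamplingTop(SketchTLH &tlh, size_t bytesUntilSample) {
    uintptr_t realTop = (tlh.realHeapTop != 0) ? tlh.realHeapTop : tlh.heapTop;
    assert(tlh.heapAlloc <= realTop);
    if (bytesUntilSample < realTop - tlh.heapAlloc) {
        tlh.realHeapTop = realTop;
        tlh.heapTop = tlh.heapAlloc + bytesUntilSample;
    } else {
        // The threshold lies beyond this TLH, so no fake top is needed.
        tlh.heapTop = realTop;
        tlh.realHeapTop = 0;
    }
}

// Restore the true limit if a fake top is currently installed.
inline void resetSamplingTop(SketchTLH &tlh) {
    if (tlh.realHeapTop != 0) {
        tlh.heapTop = tlh.realHeapTop;
        tlh.realHeapTop = 0;
    }
}

// Inline fast path: a plain bump-pointer check against heapTop.
inline bool inlineAllocate(SketchTLH &tlh, size_t size, uintptr_t &result) {
    if (size > tlh.heapTop - tlh.heapAlloc) {
        return false;
    }
    result = tlh.heapAlloc;
    tlh.heapAlloc += size;
    return true;
}

// Out-of-line path: if a fake top caused the miss, count a sample, retry
// against the real top, then re-arm the threshold ("after sampling is done").
inline bool outOfLineAllocate(SketchTLH &tlh, size_t size, size_t interval,
                              unsigned &samples, uintptr_t &result) {
    bool sampled = (tlh.realHeapTop != 0);
    if (sampled) {
        samples += 1;
        resetSamplingTop(tlh);
    }
    bool ok = inlineAllocate(tlh, size, result);
    if (sampled) {
        setSamplingTop(tlh, interval);
    }
    return ok;
}
```

In this sketch the fast path itself is unchanged (one compare and one add), which is the property the description relies on to avoid a performance cost.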

Counting trace allocation bytes includes the bytes allocated inside the TLH cache before flushing (_stats.bytesAllocated(true), stats->_tlhAllocatedUsed).
traceAllocationByte for Health Center (_oolTraceAllocationBytesForTracepoint, oolObjectSamplingBytesGranularityForTracepoint) and traceAllocationByte for JEP331 (_traceAllocationBytesForHook, objectSamplingBytesGranularityForHook) are handled independently.
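
As a rough illustration of keeping the two byte counters independent, the sketch below uses one accumulator per consumer, each with its own granularity. The type names `SketchSamplingCounter` and `SketchAllocationTrace` are hypothetical, and feeding both counters the same byte counts is an assumption made for the example; the real fields named above may be updated from different code paths.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical accumulator: counts bytes and reports how many sampling
// thresholds were crossed by each update.
struct SketchSamplingCounter {
    size_t granularity;       // bytes between samples
    size_t pendingBytes = 0;  // bytes counted since the last sample

    explicit SketchSamplingCounter(size_t g) : granularity(g) {
        assert(granularity > 0);
    }

    // Add allocated bytes and return the number of samples triggered.
    size_t addBytes(size_t bytes) {
        pendingBytes += bytes;
        size_t samples = pendingBytes / granularity;
        pendingBytes %= granularity;
        return samples;
    }
};

// One counter per consumer, so changing one granularity never disturbs
// the other consumer's pending byte count.
struct SketchAllocationTrace {
    SketchSamplingCounter forTracepoint;  // assumed to serve the tracepoint
    SketchSamplingCounter forHook;        // assumed to serve the JEP331 hook

    SketchAllocationTrace(size_t tracepointGranularity, size_t hookGranularity)
        : forTracepoint(tracepointGranularity), forHook(hookGranularity) {}
};
```

With granularities of 1024 and 512 bytes, for example, a 600-byte allocation triggers a sample for the hook counter but not for the tracepoint counter.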

Depends on eclipse-omr/omr#5260
Fixes: #7740

Signed-off-by: Lin Hu <linhu@ca.ibm.com>

@LinHu2016
Contributor Author

@amicic @dmitripivkine please review changes, thanks

_vmThread->nonZeroHeapTop = tlh->realHeapTop;
tlh->realHeapTop = NULL;
}
}
Contributor

Looks like the question I asked about support for dual TLH mode applies here: Am I reading correctly that if we set size for both TLHs we can get 2 * size to be allocated potentially?

Contributor Author

Yes, we might not get a very accurate result in the non-zero case; I have not found a better way to handle it.

Contributor

Do we have a platform that can actually have both active (btw, is X using just nonZeroHeapAlloc/Top?)? I'm willing to ignore that issue for now if we think it will take more than a day to implement and test; we can follow up after the upcoming release (by, as you said, disabling one of the two TLHs).

Contributor

I believe the non-zeroed TLH is used on pLinux (LE and BE) and AIX. I am not sure about the current status of zLinux.

Contributor

The non-zeroed TLH can be used for primitive arrays only. So, to see a mismatch, an application would need to include primitive arrays in its allocation mix.

Contributor

I agree that proper handling of the dual TLH case can be done later in a separate change.
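
For readers following the dual TLH question in this thread, the small sketch below shows how arming two TLHs with the same sampling size can let up to 2 * size bytes be allocated before the first sample. It is a toy model, not OpenJ9 code: `bytesBeforeFirstSample` is a hypothetical helper, and alternating allocations between the two TLHs is an assumed worst-case mix.

```cpp
#include <cassert>
#include <cstddef>

// Toy model: the zeroed and non-zeroed TLHs are each armed with the same
// sampling interval, and allocations alternate between them. Returns the
// total bytes allocated before either fast path fails (the first sample).
inline size_t bytesBeforeFirstSample(size_t interval, size_t objectSize) {
    assert(objectSize > 0);
    size_t remaining[2] = {interval, interval};  // [0] zeroed, [1] non-zeroed
    size_t total = 0;
    for (size_t i = 0;; ++i) {
        size_t &budget = remaining[i % 2];
        if (objectSize > budget) {
            return total;  // this allocation would go out of line and be sampled
        }
        budget -= objectSize;
        total += objectSize;
    }
}
```

With an interval of 1000 bytes and 100-byte objects, each TLH absorbs 1000 bytes on its own budget, so 2000 bytes are allocated before the first sample in this model.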

@LinHu2016 LinHu2016 force-pushed the JEP331_update branch 9 times, most recently from 3e18140 to 86eb958 Compare June 4, 2020 13:49
@dmitripivkine
Contributor

Jenkins test sanity all jdk11

@LinHu2016
Contributor Author

I have verified the latest personal build with the customer's JEP331 test:
https://hyc-runtimes-jenkins.swg-devops.com/view/OpenJ9%20-%20Personal/job/Pipeline-Build-Test-Personal/6218/
/team/linhu/JEP331/result1
/team/linhu/JEP331/result2
/team/linhu/JEP331/result3

@amicic amicic added the comp:gc label Jun 4, 2020
@dmitripivkine dmitripivkine merged commit f07d574 into eclipse-openj9:master Jun 4, 2020
U8Pointer realHeapAlloc = adjustedToRange(vmThread.allocateThreadLocalHeap().realHeapAlloc(), base, top);
if(realHeapAlloc.notNull() && isSomethingToAdd(realHeapAlloc, heapTop)) {
excludedRangeList.add(new U8Pointer[] {realHeapAlloc, heapTop});
U8Pointer realHeapTop = adjustedToRange(vmThread.allocateThreadLocalHeap().realHeapTop(), base, top);
Contributor

@keithc-ca keithc-ca Jun 4, 2020

This will fail with NoSuchFieldError when examining core files created before the addition of realHeapTop.

Contributor

Good point, thank you very much.

Successfully merging this pull request may close these issues.

Low-overhead Heap Profiling (JEP331) reports wrong allocation ratios