
JIT_NewS_MP_FastPortable is slow for Orchard CMS benchmark on Linux-arm64 #99552

Closed
@EgorBo

Description

Orchard CMS is one of the most complicated benchmarks in our test suite (far more complicated than the TE benchmarks), so presumably it is closer to real-world workloads. Currently, ARM64 is twice as slow as x64 on it (comparable HW), while for the TE benchmarks the same ARM64 hardware is typically 1.2-1.5x faster than that x64 AMD machine.
The perf trace is pointing to JIT_NewS_MP_FastPortable:
image

(sorted by 'self', aka exclusive, time).

The annotated asm for it:
image
Or (output of objdump -d):

000000000033fbf0 <_Z24JIT_NewS_MP_FastPortableP21CORINFO_CLASS_STRUCT_>:
  33fbf0:	a9bf7bfd 	stp	x29, x30, [sp, #-16]!
  33fbf4:	910003fd 	mov	x29, sp
  33fbf8:	aa0003e8 	mov	x8, x0
  33fbfc:	90001920 	adrp	x0, 663000 <_ZTV12EEJitManager+0x50>
  33fc00:	f947dc01 	ldr	x1, [x0, #4024]
  33fc04:	913ee000 	add	x0, x0, #0xfb8
  33fc08:	d63f0020 	blr	x1
  33fc0c:	d53bd049 	mrs	x9, tpidr_el0
  33fc10:	b940050a 	ldr	w10, [x8, #4]
  33fc14:	f8606929 	ldr	x9, [x9, x0]
  33fc18:	a945ad20 	ldp	x0, x11, [x9, #88]
  33fc1c:	cb00016b 	sub	x11, x11, x0
  33fc20:	eb0a017f 	cmp	x11, x10
  33fc24:	54000082 	b.cs	33fc34 <_Z24JIT_NewS_MP_FastPortableP21CORINFO_CLASS_STRUCT_+0x44>  // b.hs, b.nlast
  33fc28:	aa0803e0 	mov	x0, x8
  33fc2c:	a8c17bfd 	ldp	x29, x30, [sp], #16
  33fc30:	14000006 	b	33fc48 <_Z7JIT_NewP21CORINFO_CLASS_STRUCT_>
  33fc34:	8b0a000a 	add	x10, x0, x10
  33fc38:	f9002d2a 	str	x10, [x9, #88]
  33fc3c:	f9000008 	str	x8, [x0]
  33fc40:	a8c17bfd 	ldp	x29, x30, [sp], #16
  33fc44:	d65f03c0 	ret

If I read this correctly, we spend most of the time reading data from TLS (to get the current allocation context?). I'd expect to see the time spent inside the fallback call into the GC, but that is not the case (if the trace is accurate). Another hint from the general trace is that we don't really spend a lot of time in the GC itself.

PS: what is the blr x1 call? I presume it is part of GetThread().
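
For readers who don't want to decode the asm, here is a rough C++ sketch of what the disassembly above appears to do: fetch a per-thread allocation context from TLS, compare the remaining space against the object size, then either bump the pointer or fall back to the slow helper. All names here (MethodTableSketch, AllocContextSketch, t_alloc_context, slow_alloc, fast_alloc) are hypothetical stand-ins and not the actual CoreCLR types or layout; the real code reads the pointer/limit pair at offset 88 of a thread-local object, which this sketch does not reproduce.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical stand-in for the class descriptor passed in x0.
struct MethodTableSketch {
    std::uint32_t flags;
    std::uint32_t base_size;  // at offset 4, mirroring `ldr w10, [x8, #4]`
};

// Hypothetical per-thread bump-allocation window (mirrors the ldp at #88).
struct AllocContextSketch {
    std::uint8_t* alloc_ptr = nullptr;
    std::uint8_t* alloc_limit = nullptr;
};

// The TLS variable: every fast-path allocation starts by locating this.
thread_local AllocContextSketch t_alloc_context;

// Stand-in for the tail call to JIT_New (assumption: the real slow path
// asks the GC for memory; here we just use the C++ heap).
inline void* slow_alloc(const MethodTableSketch* mt) {
    void* obj = ::operator new(mt->base_size);
    std::memcpy(obj, &mt, sizeof mt);  // first word = type pointer
    return obj;
}

// Sketch of the fast path; assumes base_size >= sizeof(void*).
inline void* fast_alloc(const MethodTableSketch* mt) {
    AllocContextSketch& ctx = t_alloc_context;  // TLS lookup (mrs tpidr_el0 + load)
    const std::size_t size = mt->base_size;
    const std::size_t avail =
        static_cast<std::size_t>(ctx.alloc_limit - ctx.alloc_ptr);
    if (avail < size) {
        return slow_alloc(mt);  // not enough room: take the slow helper
    }
    std::uint8_t* obj = ctx.alloc_ptr;
    ctx.alloc_ptr = obj + size;        // bump the pointer (str x10, [x9, #88])
    std::memcpy(obj, &mt, sizeof mt);  // store type pointer (str x8, [x0])
    return obj;
}
```

If this reading is right, the only work on the fast path besides the TLS lookup is a compare, an add and two stores, which would be consistent with the TLS access dominating the self time. One possible explanation for the blr x1 (an assumption on my side, not verified against the build flags) is that the adrp/ldr/add/blr sequence is a TLS descriptor resolver call emitted for a thread_local accessed from a shared library.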

cc @mangod9 @kunalspathak


Labels

area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
