Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Z: Short circuit mem cmp sign macro op #7220

Merged

Conversation

ehsankianifar
Copy link
Contributor

@ehsankianifar ehsankianifar commented Jan 9, 2024

If the addresses of two arrays are equal, those two arrays are considered equal and we do not need to actually compare those arrays. In this PR, we modified the instructions for x86 and there was a discussion to do the same for other architectures. On Z, it is more complicated and depending on the conditions and expected response, several methods implement array compare.
This change targets memory compare implementation. If the Array cmp asks for a signed response (-1/0/1), the memCmp implementation would be called.

generateLoop();
setInRemainder(true);
generateRemainder();

_cursor = generateCmpResult(_cg, _rootNode, _dstReg, _srcReg, _resultReg, _falseLabel, _trueLabel, _doneLabel);
TR::RegisterDependencyConditions * dependencies = generateDependencies();
_cursor->setDependencyConditions(dependencies);
if(_startControlFlow==NULL)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if is not necessary any more since the control flow always start from the short circuit instructions

@ehsankianifar
Copy link
Contributor Author

Tested for JDK8 (LoZ and zOS) Axxon 63728, test level.sanity, all tests pass
JDK11 (LoZ) Build #20093, test sanity.functional sanity.system sanity.openjdk, all tests pass.
JDK17 (LoZ) Build #20115, test sanity.functional sanity.system sanity.openjdk, 3 functional tests fail but all 3 are related to open ssl issues.

if(_startControlFlow->getOpCodeValue() == TR::InstOpCode::assocreg) _startControlFlow=_startControlFlow->getNext();
}

_startControlFlow=cursorBefore->getNext();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be the one place where we should be refactoring the code. Couple of things seems wrong to me.

  1. Internal Control Flow starts with the BRC instruction you are generating for short circuit. So it should now be simple to have startICF label generated, and we do not need to go on after generating all instructions to find startICF.
  2. I do not think we need to attach post dependency conditions to the start of internal control flow. Your snippet from instruction selection log suggests that we can fix that.

Copy link
Contributor Author

@ehsankianifar ehsankianifar Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review. I start apply this change right now.

static TR::Instruction *
generateShortCircuitInstructions(TR::CodeGenerator * cg, TR::Node * node, TR::Register * result, TR::Register * src, TR::Register * dst, TR::LabelSymbol * label)
{
TR::LabelSymbol * cFlowRegionStart = generateLabelSymbol(cg);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Control flow always start here

_startControlFlow=cursorBefore->getNext();
if(_startControlFlow->getOpCodeValue() == TR::InstOpCode::assocreg) _startControlFlow=_startControlFlow->getNext();
}
if(_startControlFlow != _cursor)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is always a control flow therefore it is not needed

generateS390LabelInstruction(_cg, TR::InstOpCode::label, _rootNode, cFlowRegionStart, dependencies, _startControlFlow->getPrev());
cFlowRegionStart->setStartInternalControlFlow();
TR::RegisterDependencyConditions * dependencies = generateDependencies();
generateS390LabelInstruction(_cg, TR::InstOpCode::label, _rootNode, cFlowRegionEnd, dependencies);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Control flow ends after generating results. In case of short circuiting, we skip generating results since result reg has the desired value of zero.

@@ -1556,32 +1557,24 @@ MemCmpConstLenMacroOp::generate(TR::Register* dstReg, TR::Register* srcReg, TR::
_litReg = NULL;
TR::Compilation *comp = _cg->comp();

if (_length == 0)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the length is zero then arrays are equal and we should return zero in result register. No other instructions needed.

@ehsankianifar
Copy link
Contributor Author

Tree structure for variable length arrays:

 [     0x2aa43d5e570]	                       L       GPR_0019, Parm[Parm  2<parm 2 I>] ?+16(GPR15)
 [     0x2aa43d7c4b0]	                       LG      &GPR_0020, Parm[Parm  0<parm 0 L>] ?+0(GPR15)
 [     0x2aa43d7c700]	                       LG      &GPR_0022, Parm[Parm  1<parm 1 L>] ?+8(GPR15)
 [     0x2aa43d7cae0]	                       SGRK    GPR_0024,&GPR_0022,&GPR_0020	
 [     0x2aa43d7cba0]	                       Label L0021:	# (Start of internal control flow)	
 [     0x2aa43d7cc60]	                       BRC     BRNL(0x8), Label L0018	
 [     0x2aa43d7cf20]	                       LLILF   GPR_0025,0	
 [     0x2aa43d7d0b0]	                       SRAG    GPR_0026,GPR_0019,8
 [     0x2aa43d7d170]	                       BRC     BH(0x8), Label L0023	
 [     0x2aa43d7d230]	                       Label L0022:	
 [     0x2aa43d7d470]	                       CLC     #391 0(256,&GPR_0022),#390 0(&GPR_0020)
 [     0x2aa43d7d530]	                       BRC     BNH(0x6), Label L0016	
 [     0x2aa43d7d6b0]	                       LA      &GPR_0020,#392 256(&GPR_0020)
 [     0x2aa43d7d820]	                       LA      &GPR_0022,#393 256(&GPR_0022)
 [     0x2aa43d7d8d0]	                       BRCT    GPR_0026,Label L0022	
 [     0x2aa43d7d990]	                       Label L0023:	
 [     0x2aa43d7dab0]	                       RISBGN  GPR_0019,GPR_0019,52,187,4	
 [     0x2aa43d7dc40]	                       BAS     GPR_0025,#394 0(GPR_0019,GPR_0025)
 [     0x2aa43d7dcf0]	                       Label L0024:	
 [     0x2aa43d7ddb0]	                       Label L0016:	
 [     0x2aa43d7de70]	                       IPM     GPR_0024,GPR_0024	
 [     0x2aa43d7df20]	                       SLLG    GPR_0024,GPR_0024,34
 [     0x2aa43d7dfe0]	                       SRAG    GPR_0024,GPR_0024,62
 [     0x2aa43d7e0a0]	                       Label L0018:	
 [     0x2aa43d7e820]	                       assocreg
 [     0x2aa43d7e280]	                       Label L0020:	# (End of internal control flow)	
 POST:
 {GPR14:GPR_0025:R} {GPR1:&GPR_0022:R} {GPR2:&GPR_0020:R} {GPR0:GPR_0026:R} {AssignAny:GPR_0024:R} {AssignAny:GPR_0019:R}
 [     0x2aa43d7e980]	                       LGFR    GPR_0024,GPR_0024	
 [     0x2aa43d7efe0]	                       assocreg
 [     0x2aa43d7ea30]	                       retn    	
 POST:
 {GPR2:GPR_0024:R}

@ehsankianifar
Copy link
Contributor Author

Instruction for constant length array:"


 [     0x2aa0fb083c0]	                       LG      &GPR_0017, Parm[Parm  0<parm 0 L>] ?+0(GPR15)
 [     0x2aa0fb263d0]	                       LG      &GPR_0019, Parm[Parm  1<parm 1 L>] ?+8(GPR15)
 [     0x2aa0fb265b0]	                       SGRK    GPR_0021,&GPR_0019,&GPR_0017	
 [     0x2aa0fb26670]	                       Label L0021:	# (Start of internal control flow)	
 [     0x2aa0fb26730]	                       BRC     BRNL(0x8), Label L0018	
 [     0x2aa0fb26970]	                       CLC     #390 0(256,&GPR_0019),#389 0(&GPR_0017)
 [     0x2aa0fb26a30]	                       BRC     BNH(0x6), Label L0016	
 [     0x2aa0fb26c70]	                       CLC     #392 256(244,&GPR_0019),#391 256(&GPR_0017)
 [     0x2aa0fb26d30]	                       Label L0016:	
 [     0x2aa0fb26df0]	                       IPM     GPR_0021,GPR_0021	
 [     0x2aa0fb26ea0]	                       SLLG    GPR_0021,GPR_0021,34
 [     0x2aa0fb26f60]	                       SRAG    GPR_0021,GPR_0021,62
 [     0x2aa0fb27020]	                       Label L0018:	
 [     0x2aa0fb27730]	                       assocreg
 [     0x2aa0fb27190]	                       Label L0020:	# (End of internal control flow)	
 POST:
 {AssignAny:&GPR_0019:R} {AssignAny:&GPR_0017:R} {AssignAny:GPR_0021:R}
 [     0x2aa0fb27890]	                       LGFR    GPR_0021,GPR_0021	
 [     0x2aa0fb27ef0]	                       assocreg
 [     0x2aa0fb27940]	                       retn    	
 POST:
 {GPR2:GPR_0021:R}

For large arrays (5000 byte length):

 [     0x2aa3f3553c0]	                       LG      &GPR_0017, Parm[Parm  0<parm 0 L>] ?+0(GPR15)
 [     0x2aa3f3733d0]	                       LG      &GPR_0019, Parm[Parm  1<parm 1 L>] ?+8(GPR15)
 [     0x2aa3f3735b0]	                       SGRK    GPR_0021,&GPR_0019,&GPR_0017	
 [     0x2aa3f373670]	                       Label L0021:	# (Start of internal control flow)	
 [     0x2aa3f373730]	                       BRC     BRNL(0x8), Label L0018	
 [     0x2aa3f373960]	                       LGHI    GPR_0022,0x13	
 [     0x2aa3f373a20]	                       Label L0022:	
 [     0x2aa3f373c60]	                       CLC     #390 0(256,&GPR_0019),#389 0(&GPR_0017)
 [     0x2aa3f373d20]	                       BRC     BNH(0x6), Label L0016	
 [     0x2aa3f373ea0]	                       LA      &GPR_0017,#391 256(&GPR_0017)
 [     0x2aa3f374010]	                       LA      &GPR_0019,#392 256(&GPR_0019)
 [     0x2aa3f3740c0]	                       BRCT    GPR_0022,Label L0022	
 [     0x2aa3f374180]	                       Label L0023:	
 [     0x2aa3f3743c0]	                       CLC     #394 0(136,&GPR_0019),#393 0(&GPR_0017)
 [     0x2aa3f374480]	                       Label L0016:	
 [     0x2aa3f374540]	                       IPM     GPR_0021,GPR_0021	
 [     0x2aa3f3745f0]	                       SLLG    GPR_0021,GPR_0021,34
 [     0x2aa3f3746b0]	                       SRAG    GPR_0021,GPR_0021,62
 [     0x2aa3f374770]	                       Label L0018:	
 [     0x2aa3f374e80]	                       assocreg
 [     0x2aa3f3748e0]	                       Label L0020:	# (End of internal control flow)	
 POST:
 {GPR1:&GPR_0019:RD} {GPR2:&GPR_0017:RD} {GPR0:GPR_0022:R} {AssignAny:GPR_0021:R}
 [     0x2aa3f374fe0]	                       LGFR    GPR_0021,GPR_0021	
 [     0x2aa3f375640]	                       assocreg
 [     0x2aa3f375090]	                       retn    	
 POST:
 {GPR2:GPR_0021:R}

@ehsankianifar
Copy link
Contributor Author

@r30shah I removed dependency conditions from start of control flow and put the start of control flow right were in insert the first branch. Please let me know if it looks good to you.

@ehsankianifar
Copy link
Contributor Author

Tested for JDK8 (LoZ and zOS) Axxon 64107, test level.sanity, all tests pass
JDK11 (LoZ) Build #20196, test sanity.functional sanity.system sanity.openjdk, all tests pass.

Copy link
Contributor

@r30shah r30shah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @ehsankianifar for testing this through various testing pipelines and looking at the instruction selection snippet, it looks correct.

@r30shah
Copy link
Contributor

r30shah commented Jan 17, 2024

Jenkins build zos,zlinux

@r30shah
Copy link
Contributor

r30shah commented Jan 17, 2024

@hzongaro Can I request you to merge this change based on the testing @ehsankianifar has done (#7220 (comment)) ?

Copy link
Contributor

@hzongaro hzongaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked over the code quickly. The changes look good to me, though I'm relying heavily on @r30shah's earlier review. I just have a couple of comments about indenting.

compiler/z/codegen/OpMemToMem.cpp Outdated Show resolved Hide resolved
compiler/z/codegen/OpMemToMem.cpp Outdated Show resolved Hide resolved
If addresses of two blocks of memory are equal, the content is equal too.
We can jump to the end label without comparing the content.
Four methods are updated to handle signed and unsigned result as well as constant and variable length.

Signed-off-by: Ehsan Kiani Far <ehsan.kianifar@gmail.com>
Copy link
Contributor

@hzongaro hzongaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks!

@hzongaro
Copy link
Contributor

x86-64 macOS failure is due to issue #7181. Relevant testing was successful. Merging.

@hzongaro hzongaro merged commit d16d36b into eclipse-omr:master Jan 23, 2024
4 of 6 checks passed
@hzongaro hzongaro self-assigned this Feb 12, 2024
@ehsankianifar ehsankianifar deleted the ShortCircuitMemCmpSignMacroOp branch March 1, 2024 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants