Commit 8946ed7

asgibbons, TobiHartmann, lahodaj, albertnetymk, and Hamlin Li authored
Asgibbons crc32c (#7)
* Use existing CRC32 code with different table for CRC32-C
* 8275643: C2's unaryOp vector intrinsic does not properly handle LongVector.neg
  Reviewed-by: chagedorn, sviswanathan
* 8277213: CompileTask_lock is acquired out of order with MethodCompileQueue_lock
  Reviewed-by: rbackman, coleenp
* 8273039: JShell crashes when naming variable or method "abstract" or "strictfp"
  Reviewed-by: vromero
* 8277324: C2 compilation fails with "bad AD file" on x86-32 after JDK-8276162 due to missing match rule
  Reviewed-by: chagedorn, roland
* 8277371: Remove unnecessary DefNewGeneration::ref_processor_init()
  Reviewed-by: stefank, tschatzl, mli
* 8277439: G1: Correct include guard name in G1EvacFailureObjectsSet.hpp
  Reviewed-by: tschatzl, sjohanss
* 8277449: compiler/vectorapi/TestLongVectorNeg.java fails with release VMs
  Reviewed-by: thartmann, chagedorn
* 8276774: Cookie stored in CookieHandler not sent if user headers contain cookie
  Reviewed-by: michaelm
* 8277427: Update jib-profiles.js to use JMH 1.33 devkit
  Reviewed-by: shade, erikj
* 8275745: Reproducible copyright headers
  Reviewed-by: ihse, erikj
* 8276150: Quarantined jpackage apps are labeled as "damaged"
  Reviewed-by: almatvee
* 8275887: jarsigner prints invalid digest/signature algorithm warnings if keysize is weak/disabled
  Reviewed-by: weijun
* 8277212: GC accidentally cleans valid megamorphic vtable inline caches
  Reviewed-by: eosterlund, pliden, coleenp, thartmann
* 8277342: vmTestbase/nsk/stress/strace/strace004.java fails with SIGSEGV in InstanceKlass::jni_id_for
  Reviewed-by: dholmes, hseigel
* 8274949: Use String.contains() instead of String.indexOf() in java.base
  Reviewed-by: weijun, dfuchs, vtewari, lancea
* 8274333: Redundant null comparison after Pattern.split
  Reviewed-by: mullan, weijun, rriggs, iris
* 8274946: Cleanup unnecessary calls to Throwable.initCause() in java.rmi
  Reviewed-by: iris, rriggs
* 8275386: Change nested classes in jdk.jlink to static nested classes
  Reviewed-by: alanb, rriggs, iris
* 8277092: TestMetaspaceAllocationMT2.java#ndebug-default fails with "RuntimeException: Committed seems high: NNNN expected at most MMMM"
  Reviewed-by: coleenp
* 8277370: configure script cannot distinguish WSL version
  Reviewed-by: erikj
* 8273544: Increase test coverage for snippets
  Reviewed-by: jjg
* 8277494: [BACKOUT] JDK-8276150 Quarantined jpackage apps are labeled as "damaged"
  Reviewed-by: asemenyuk, tschatzl
* 8276662: Scalability bottleneck in SymbolTable::lookup_common()
  Reviewed-by: redestad, dholmes, iklam, shade
* 8272773: Configurable card table card size
  Reviewed-by: tschatzl, ayang
* 8277485: Zero: Fix _fast_{i,f}access_0 bytecodes handling
  Reviewed-by: sgehwolf, shade
* 8224922: Access JavaFileObject from Element(s)
  Co-authored-by: Jan Lahoda <jlahoda@openjdk.org>
  Reviewed-by: jjg
* 8275448: [REDO] AArch64: Implement string_compare intrinsic in SVE
  Reviewed-by: ngasson, aph
* 8277385: Zero: Enable CompactStrings support
  Reviewed-by: redestad, adinn
* 8277534: Remove unused ReferenceProcessor::has_discovered_references
  Reviewed-by: tschatzl
* 8266593: vmTestbase/nsk/jvmti/PopFrame/popframe011 fails with "assert(java_thread == _state->get_thread()) failed: Must be"
  Reviewed-by: mdoerr, lmesnik, dcubed
* 8277428: G1: Move and inline G1STWIsAliveClosure::do_object_b
  Reviewed-by: tschatzl, sjohanss
* 8273792: JumpableGenerator.rngs() documentation refers to wrong method
  Co-authored-by: Guy Steele <gls@openjdk.org>
  Reviewed-by: rriggs
* 8274685: Documentation suggests there are ArbitrarilyJumpableGenerator when none
  Co-authored-by: Guy Steele <gls@openjdk.org>
  Reviewed-by: rriggs
* 8277239: SIGSEGV in vrshift_reg_maskedNode::emit
  Reviewed-by: sviswanathan, dlong
* 8277522: Make formatting of null consistent in Elements
  Reviewed-by: jlahoda
* 8265795: vmTestbase/nsk/jvmti/AttachOnDemand/attach022/TestDescription.java fails when running with JEP 416
  Reviewed-by: sspitsyn, dholmes
* 8277429: Conflicting jpackage static library name
  Reviewed-by: almatvee, herrick, erikj
* 8273341: Update Siphash to version 1.0
  Reviewed-by: dholmes
* 8264297: Create implementation for NSAccessibilityProgressIndicator protocol peer
  Reviewed-by: pbansal
* 8277576: ProblemList runtime/ErrorHandling/CreateCoredumpOnCrash.java on macosx-X64
  8277577: ProblemList compiler/onSpinWait/TestOnSpinWaitAArch64DefaultFlags.java on linux-aarch64
  8277578: ProblemList applications/jcstress/acqrel.java on linux-aarch64
  Reviewed-by: mikael
* 8277423: ciReplay: hidden class with comment expected error
  Reviewed-by: chagedorn, thartmann
* 8273095: vmTestbase/vm/mlvm/anonloader/stress/oome/heap/Test.java fails with "wrong OOME"
  Reviewed-by: shade, stefank
* 8277542: G1: Move G1CardSetFreePool and related classes to separate files
  Reviewed-by: ayang, tschatzl
* 8277507: Add jlink.debug system property while launching jpackage tests to help diagonize recent intermittent failures
  Reviewed-by: almatvee
* 8277087: ZipException: zip END header not found at ZipFile#Source.findEND
  Reviewed-by: lancea
* 8276216: Negated character classes performance regression in Pattern
  Reviewed-by: clanger
* 8277556: Call ReferenceProcessorPhaseTimes::set_processing_is_mt once
  Reviewed-by: sjohanss, tschatzl
* 8277560: Remove WorkerDataArray::_is_serial
  Reviewed-by: sjohanss, tschatzl
* 8277413: Remove unused local variables in jdk.hotspot.agent
  Reviewed-by: lmesnik, tschatzl, sspitsyn
* 8277441: CompileQueue::add fails with assert(_last->next() == __null) failed: not last
  Reviewed-by: chagedorn, neliasso
* 8276696: ParallelObjectIterator freed at the wrong time in VM_HeapDumper
  Reviewed-by: pliden, stefank
* 8272042: java.util.ImmutableCollections$Map1 and MapN should not be @valuebased
  Reviewed-by: mchung, iris, naoto, smarks
* 8277649: [BACKOUT] JDK-8277507 Add jlink.debug system property while launching jpackage tests to help diagonize recent intermittent failures
  Reviewed-by: alanb, stefank
* 8254108: ciReplay: Support incremental inlining
  Reviewed-by: dlong, thartmann
* 8261847: performance of java.lang.Record::toString should be improved
  Reviewed-by: jlaskey, redestad
* 8268725: jshell does not support the --enable-native-access option
  Reviewed-by: sundar
* 8277350: runtime/jni/checked/TestPrimitiveArrayCriticalWithBadParam.java times out
  Reviewed-by: hseigel, dholmes, lmesnik
* 8277451: java.lang.reflect.Field::set on static field with invalid argument type should throw IAE
  Reviewed-by: alanb
* 8271623: Omit enclosing instance fields from inner classes that don't use it
  Reviewed-by: vromero, jlahoda
* 8276764: Enable deterministic file content ordering for Jar and Jmod
  Reviewed-by: mchung, ihse
* 8265796: vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java fails when running with JEP 416
  Reviewed-by: cjplummer, mchung
* 8277503: compiler/onSpinWait/TestOnSpinWaitAArch64DefaultFlags.java failed with "OnSpinWaitInst with the expected value 'isb' not found."
  Reviewed-by: chagedorn, aph, phh
* 8277397: ZGC: Add JFR event for temporary latency measurements
  Reviewed-by: eosterlund, jbachorik, pliden, mgronlun
* 8277399: ZGC: Move worker thread logging out of gc+phase=debug
  Reviewed-by: eosterlund, pliden
* 8273328: Compiler implementation for Pattern Matching for switch (Second Preview)
  Reviewed-by: vromero, mcimadamore
* 8277562: Remove dead method c1 If::swap_sux
  Reviewed-by: thartmann, neliasso
* 8277042: add test for 8276036 to compiler/codecache
  Reviewed-by: chagedorn, thartmann
* 8275063: Implementation of Foreign Function & Memory API (Second incubator)
  Reviewed-by: erikj, psandoz, jvernee, darcy
* 8275320: NMT should perform buffer overrun checks
  8275320: NMT should perform buffer overrun checks
  8275301: Unify C-heap buffer overrun checks into NMT
  Reviewed-by: simonis, zgu
* 8276665: ObjectInputStream.GetField.get(name, object) should throw ClassNotFoundException
  Reviewed-by: naoto, lancea, smarks
* 8272728: javac ignores any -J option in @argfiles silently
  Reviewed-by: jjg
* 8235876: Misleading warning message in java source-file mode
  Reviewed-by: vromero
* 8274161: Cleanup redundant casts in jdk.compiler
  Reviewed-by: vromero
* 8264605: vmTestbase/nsk/jvmti/SuspendThread/suspendthrd003/TestDescription.java failed with "agent_tools.cpp, 471: (foundThread = (jthread) jni_env->NewGlobalRef(foundThread)) != NULL"
  Reviewed-by: sspitsyn, dholmes
* 8276124: Provide snippet support for properties files
  Co-authored-by: Jonathan Gibbons <jjg@openjdk.org>
  Co-authored-by: Hannes Wallnöfer <hannesw@openjdk.org>
  Reviewed-by: jjg
* 8277806: 4 tools/jar failures per platform after JDK-8272728
  Reviewed-by: alanb, jjg
* 8277811: ProblemList vmTestbase/nsk/jdi/TypeComponent/isSynthetic/issynthetic001/TestDescription.java
  8277813: ProblemList vmTestbase/nsk/jvmti/AttachOnDemand/attach002a/TestDescription.java
  Reviewed-by: dholmes
* 8258117: jar tool sets the time stamp of module-info.class entries to the current time
  Reviewed-by: lancea, ihse, alanb
* 8270435: UT: MonitorUsedDeflationThresholdTest failed: did not find too_many string in output
  Reviewed-by: dholmes
* 8275687: runtime/CommandLine/PrintTouchedMethods test shouldn't catch RuntimeException
  Reviewed-by: iklam, chagedorn
* 8277631: ZGC: CriticalMetaspaceAllocation asserts
  Reviewed-by: pliden, stefank, dholmes
* 8277786: G1: Rename log2_card_region_per_heap_region used in G1CardSet
  Reviewed-by: ayang, tschatzl, mli
* 8277825: Remove unused ReferenceProcessorPhaseTimes::_sub_phases_total_time_ms
  Reviewed-by: tschatzl
* 8277504: Use String.stripTrailing instead of hand-crafted method in SwingUtilities2
  Reviewed-by: pbansal, serb
* 8277165: jdeps --multi-release --print-module-deps fails if module-info.class in different versioned directories
  8277166: Data race in jdeps VersionHelper
  8277123: jdeps does not report some exceptions correctly
  Reviewed-by: jvernee, alanb
* 8277659: [TESTBUG] Microbenchmark ThreadOnSpinWaitProducerConsumer.java hangs
  Reviewed-by: njian, ngasson
* 8277508: need to check has_predicated_vectors before calling scalable_predicate_reg_slots
  Reviewed-by: njian, thartmann, ngasson
* 8277417: C1 LIR instruction for load-klass
  Reviewed-by: iveresov, mdoerr, ngasson, aph
* 8275330: C2: assert(n->is_Root() || n->is_Region() || n->is_Phi() || n->is_MachMerge() || def_block->dominates(block)) failed: uses must be dominated by definitions
  Reviewed-by: thartmann, chagedorn
* 8277139: Improve code readability in PredecessorValidator (c1_IR.cpp)
  Reviewed-by: thartmann, chagedorn
* 8277860: PPC: Remove duplicate info != NULL check
  Reviewed-by: chagedorn, mdoerr
* 8277411: C2 fast_unlock intrinsic on AArch64 has unnecessary ownership check
  Reviewed-by: ngasson, neliasso
* 8275908: Record null_check traps for calls and array_check traps in the interpreter
  Reviewed-by: chagedorn, mdoerr
* 8276685: Malformed Javadoc inline tags in JDK source in /jdk/management/jfr/RecordingInfo.java
  Reviewed-by: mgronlun
* 8276670: G1: Rename G1CardSetFreePool and related classes
  Reviewed-by: tschatzl, ayang
* Use existing CRC32 code with different table for CRC32-C

Co-authored-by: Tobias Hartmann <thartmann@openjdk.org>
Co-authored-by: Jan Lahoda <jlahoda@openjdk.org>
Co-authored-by: Albert Mingkun Yang <ayang@openjdk.org>
Co-authored-by: Hamlin Li <mli@openjdk.org>
Co-authored-by: Jie Fu <jiefu@openjdk.org>
Co-authored-by: Daniel Fuchs <dfuchs@openjdk.org>
Co-authored-by: Claes Redestad <redestad@openjdk.org>
Co-authored-by: Magnus Ihse Bursie <mag@icus.se>
Co-authored-by: Andy Herrick <herrick@openjdk.org>
Co-authored-by: Sean Mullan <mullan@openjdk.org>
Co-authored-by: Stefan Karlsson <stefank@openjdk.org>
Co-authored-by: Coleen Phillimore <coleenp@openjdk.org>
Co-authored-by: Andrey Turbanov <turbanoff@gmail.com>
Co-authored-by: Thomas Stuefe <stuefe@openjdk.org>
Co-authored-by: Yasumasa Suenaga <ysuenaga@openjdk.org>
Co-authored-by: Pavel Rappo <prappo@openjdk.org>
Co-authored-by: Daniel D. Daugherty <dcubed@openjdk.org>
Co-authored-by: Derek White <drwhite@openjdk.org>
Co-authored-by: Vishal Chand <vishalchand2492@gmail.com>
Co-authored-by: Joe Darcy <darcy@openjdk.org>
Co-authored-by: TatWai Chong <tatwai.chong@arm.com>
Co-authored-by: Aleksey Shipilev <shade@openjdk.org>
Co-authored-by: Serguei Spitsyn <sspitsyn@openjdk.org>
Co-authored-by: Jim Laskey <jlaskey@openjdk.org>
Co-authored-by: Guy Steele <gls@openjdk.org>
Co-authored-by: Jatin Bhateja <jbhateja@openjdk.org>
Co-authored-by: Leonid Mesnik <lmesnik@openjdk.org>
Co-authored-by: Alexey Semenyuk <asemenyuk@openjdk.org>
Co-authored-by: Alexander Zuev <kizune@openjdk.org>
Co-authored-by: Dean Long <dlong@openjdk.org>
Co-authored-by: Jaikiran Pai <jpai@openjdk.org>
Co-authored-by: Sergey Bylokhov <serb@openjdk.org>
Co-authored-by: Volker Simonis <simonis@openjdk.org>
Co-authored-by: Erik Österlund <eosterlund@openjdk.org>
Co-authored-by: Roger Riggs <rriggs@openjdk.org>
Co-authored-by: Christian Hagedorn <chagedorn@openjdk.org>
Co-authored-by: Vicente Romero <vromero@openjdk.org>
Co-authored-by: Mandy Chung <mchung@openjdk.org>
Co-authored-by: Liam Miller-Cushon <cushon@openjdk.org>
Co-authored-by: Andrew Leonard <aleonard@openjdk.org>
Co-authored-by: Evgeny Astigeevich <eastig@amazon.com>
Co-authored-by: Ludvig Janiuk <ludvig.j.janiuk@oracle.com>
Co-authored-by: KIRIYAMA Takuya <kiriyama.takuya@fujitsu.com>
Co-authored-by: Maurizio Cimadamore <mcimadamore@openjdk.org>
Co-authored-by: Christian Stein <cstein@openjdk.org>
Co-authored-by: Adam Sotona <asotona@openjdk.org>
Co-authored-by: Jonathan Gibbons <jjg@openjdk.org>
Co-authored-by: Hannes Wallnöfer <hannesw@openjdk.org>
Co-authored-by: Lance Andersen <lancea@openjdk.org>
Co-authored-by: Fairoz Matte <fmatte@openjdk.org>
Co-authored-by: Ivan Walulya <iwalulya@openjdk.org>
Co-authored-by: Stuart Monteith <smonteith@openjdk.org>
Co-authored-by: Yadong Wang <yadongwang@openjdk.org>
Co-authored-by: Roman Kennke <rkennke@openjdk.org>
Co-authored-by: Roland Westrelin <roland@openjdk.org>
Co-authored-by: Erik Gahlin <egahlin@openjdk.org>
1 parent aec4417 commit 8946ed7
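
The commit message's headline item, reusing the CRC32 code with a different table for CRC32-C, is plausible because the two checksums run the same algorithm over different generator polynomials. The sketch below is not HotSpot code. It is a minimal bit-at-a-time illustration in which `crc_bitwise`, `kCrc32Poly`, and `kCrc32cPoly` are names invented here, and the reflected polynomial values and the initial and final inversion are the standard published parameters rather than anything taken from this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reflected (LSB-first) generator polynomials. Standard values, assumed here;
// they do not appear in the commit.
constexpr std::uint32_t kCrc32Poly  = 0xEDB88320u;  // CRC-32 (zlib style)
constexpr std::uint32_t kCrc32cPoly = 0x82F63B78u;  // CRC-32C (Castagnoli)

// Bit-at-a-time CRC. The loop is identical for both variants; only `poly`
// changes, which is the software analogue of swapping the constant table.
std::uint32_t crc_bitwise(std::uint32_t poly, const unsigned char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::uint32_t crc = 0xFFFFFFFFu;                 // initial inversion
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial when the dropped bit was 1.
            crc = (crc >> 1) ^ (poly & (0u - (crc & 1u)));
        }
    }
    return ~crc;                                     // final inversion
}
```

For the conventional check string "123456789", this sketch returns 0xCBF43926 with `kCrc32Poly` and 0xE3069283 with `kCrc32cPoly`.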

4 files changed: +54 -22 lines changed

src/hotspot/cpu/x86/macroAssembler_x86.cpp

Lines changed: 20 additions & 16 deletions
@@ -7030,7 +7030,7 @@ void MacroAssembler::fold512bit_crc32_avx512(XMMRegister xcrc, XMMRegister xK, X

 // Helper function for AVX 512 CRC32
 // Compute CRC32 for < 256B buffers
-void MacroAssembler::kernel_crc32_avx512_256B(Register crc, Register buf, Register len, Register key, Register pos,
+void MacroAssembler::kernel_crc32_avx512_256B(Register crc, Register buf, Register len, Register table, Register pos,
     Register tmp1, Register tmp2, Label& L_barrett, Label& L_16B_reduction_loop,
     Label& L_get_last_two_xmms, Label& L_128_done, Label& L_cleanup) {

@@ -7043,7 +7043,7 @@ void MacroAssembler::kernel_crc32_avx512_256B(Register crc, Register buf, Regist
   jcc(Assembler::less, L_less_than_32);

   // if there is, load the constants
-  movdqu(xmm10, Address(key, 1 * 16)); //rk1 and rk2 in xmm10
+  movdqu(xmm10, Address(table, 1 * 16)); //rk1 and rk2 in xmm10
   movdl(xmm0, crc); // get the initial crc value
   movdqu(xmm7, Address(buf, pos, Address::times_1, 0 * 16)); //load the plaintext
   pxor(xmm7, xmm0);
@@ -7070,7 +7070,7 @@ void MacroAssembler::kernel_crc32_avx512_256B(Register crc, Register buf, Regist
   pxor(xmm7, xmm0); //xor the initial crc value
   addl(pos, 16);
   subl(len, 16);
-  movdqu(xmm10, Address(key, 1 * 16)); // rk1 and rk2 in xmm10
+  movdqu(xmm10, Address(table, 1 * 16)); // rk1 and rk2 in xmm10
   jmp(L_get_last_two_xmms);

   bind(L_less_than_16_left);
@@ -7190,12 +7190,17 @@ void MacroAssembler::kernel_crc32_avx512_256B(Register crc, Register buf, Regist
 * param crc register containing existing CRC (32-bit)
 * param buf register pointing to input byte buffer (byte*)
 * param len register containing number of bytes
+* param table address of crc or crc32c table
 * param tmp1 scratch register
 * param tmp2 scratch register
 * return rax result register
+*
+* This routine is identical for crc32c with the exception of the precomputed constant
+* table which will be passed as the table argument. The calculation steps are
+* the same for both variants.
 */
-void MacroAssembler::kernel_crc32_avx512(Register crc, Register buf, Register len, Register key, Register tmp1, Register tmp2) {
-  assert_different_registers(crc, buf, len, key, tmp1, tmp2, rax);
+void MacroAssembler::kernel_crc32_avx512(Register crc, Register buf, Register len, Register table, Register tmp1, Register tmp2) {
+  assert_different_registers(crc, buf, len, table, tmp1, tmp2, rax, r12);

   Label L_tail, L_tail_restore, L_tail_loop, L_exit, L_align_loop, L_aligned;
   Label L_fold_tail, L_fold_128b, L_fold_512b, L_fold_512b_loop, L_fold_tail_loop;
@@ -7210,7 +7215,6 @@ void MacroAssembler::kernel_crc32_avx512(Register crc, Register buf, Register le
   // For EVEX with VL and BW, provide a standard mask, VL = 128 will guide the merge
   // context for the registers used, where all instructions below are using 128-bit mode
   // On EVEX without VL and BW, these instructions will all be AVX.
-  lea(key, ExternalAddress(StubRoutines::x86::crc_table_avx512_addr()));
   notl(crc);
   movl(pos, 0);

@@ -7225,15 +7229,15 @@ void MacroAssembler::kernel_crc32_avx512(Register crc, Register buf, Register le
   evmovdquq(xmm0, Address(buf, pos, Address::times_1, 0 * 64), Assembler::AVX_512bit);
   evmovdquq(xmm4, Address(buf, pos, Address::times_1, 1 * 64), Assembler::AVX_512bit);
   evpxorq(xmm0, xmm0, xmm10, Assembler::AVX_512bit);
-  evbroadcasti32x4(xmm10, Address(key, 2 * 16), Assembler::AVX_512bit); //zmm10 has rk3 and rk4
+  evbroadcasti32x4(xmm10, Address(table, 2 * 16), Assembler::AVX_512bit); //zmm10 has rk3 and rk4

   subl(len, 256);
   cmpl(len, 256);
   jcc(Assembler::less, L_fold_128_B_loop);

   evmovdquq(xmm7, Address(buf, pos, Address::times_1, 2 * 64), Assembler::AVX_512bit);
   evmovdquq(xmm8, Address(buf, pos, Address::times_1, 3 * 64), Assembler::AVX_512bit);
-  evbroadcasti32x4(xmm16, Address(key, 0 * 16), Assembler::AVX_512bit); //zmm16 has rk-1 and rk-2
+  evbroadcasti32x4(xmm16, Address(table, 0 * 16), Assembler::AVX_512bit); //zmm16 has rk-1 and rk-2
   subl(len, 256);

   bind(L_fold_256_B_loop);
@@ -7279,8 +7283,8 @@ void MacroAssembler::kernel_crc32_avx512(Register crc, Register buf, Register le
   // at this point, the buffer pointer is pointing at the last y Bytes of the buffer, where 0 <= y < 128
   // the 128B of folded data is in 8 of the xmm registers : xmm0, xmm1, xmm2, xmm3, xmm4, xmm5, xmm6, xmm7
   bind(L_fold_128_B_register);
-  evmovdquq(xmm16, Address(key, 5 * 16), Assembler::AVX_512bit); // multiply by rk9-rk16
-  evmovdquq(xmm11, Address(key, 9 * 16), Assembler::AVX_512bit); // multiply by rk17-rk20, rk1,rk2, 0,0
+  evmovdquq(xmm16, Address(table, 5 * 16), Assembler::AVX_512bit); // multiply by rk9-rk16
+  evmovdquq(xmm11, Address(table, 9 * 16), Assembler::AVX_512bit); // multiply by rk17-rk20, rk1,rk2, 0,0
   evpclmulqdq(xmm1, xmm0, xmm16, 0x01, Assembler::AVX_512bit);
   evpclmulqdq(xmm2, xmm0, xmm16, 0x10, Assembler::AVX_512bit);
   // save last that has no multiplicand
@@ -7289,7 +7293,7 @@ void MacroAssembler::kernel_crc32_avx512(Register crc, Register buf, Register le
   evpclmulqdq(xmm5, xmm4, xmm11, 0x01, Assembler::AVX_512bit);
   evpclmulqdq(xmm6, xmm4, xmm11, 0x10, Assembler::AVX_512bit);
   // Needed later in reduction loop
-  movdqu(xmm10, Address(key, 1 * 16));
+  movdqu(xmm10, Address(table, 1 * 16));
   vpternlogq(xmm1, 0x96, xmm2, xmm5, Assembler::AVX_512bit); // xor ABC
   vpternlogq(xmm1, 0x96, xmm6, xmm7, Assembler::AVX_512bit); // xor ABC

@@ -7305,7 +7309,7 @@ void MacroAssembler::kernel_crc32_avx512(Register crc, Register buf, Register le
   jcc(Assembler::less, L_final_reduction_for_128);

   bind(L_16B_reduction_loop);
-  vpclmulqdq(xmm8, xmm7, xmm10, 0x1);
+  vpclmulqdq(xmm8, xmm7, xmm10, 0x01);
   vpclmulqdq(xmm7, xmm7, xmm10, 0x10);
   vpxor(xmm7, xmm7, xmm8, Assembler::AVX_128bit);
   movdqu(xmm0, Address(buf, pos, Address::times_1, 0 * 16));
@@ -7336,14 +7340,14 @@ void MacroAssembler::kernel_crc32_avx512(Register crc, Register buf, Register le
   vpshufb(xmm2, xmm2, xmm0, Assembler::AVX_128bit);

   blendvpb(xmm2, xmm2, xmm1, xmm0, Assembler::AVX_128bit);
-  vpclmulqdq(xmm8, xmm7, xmm10, 0x1);
+  vpclmulqdq(xmm8, xmm7, xmm10, 0x01);
   vpclmulqdq(xmm7, xmm7, xmm10, 0x10);
   vpxor(xmm7, xmm7, xmm8, Assembler::AVX_128bit);
   vpxor(xmm7, xmm7, xmm2, Assembler::AVX_128bit);

   bind(L_128_done);
   // compute crc of a 128-bit value
-  movdqu(xmm10, Address(key, 3 * 16));
+  movdqu(xmm10, Address(table, 3 * 16));
   movdqu(xmm0, xmm7);

   // 64b fold
@@ -7359,14 +7363,14 @@ void MacroAssembler::kernel_crc32_avx512(Register crc, Register buf, Register le
   jmp(L_barrett);

   bind(L_less_than_256);
-  kernel_crc32_avx512_256B(crc, buf, len, key, pos, tmp1, tmp2, L_barrett, L_16B_reduction_loop, L_get_last_two_xmms, L_128_done, L_cleanup);
+  kernel_crc32_avx512_256B(crc, buf, len, table, pos, tmp1, tmp2, L_barrett, L_16B_reduction_loop, L_get_last_two_xmms, L_128_done, L_cleanup);

   //barrett reduction
   bind(L_barrett);
   vpand(xmm7, xmm7, ExternalAddress(StubRoutines::x86::crc_by128_masks_avx512_addr() + 1 * 16), Assembler::AVX_128bit, tmp2);
   movdqu(xmm1, xmm7);
   movdqu(xmm2, xmm7);
-  movdqu(xmm10, Address(key, 4 * 16));
+  movdqu(xmm10, Address(table, 4 * 16));

   pclmulqdq(xmm7, xmm10, 0x0);
   pxor(xmm7, xmm2);
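
The `L_16B_reduction_loop` hunk in this file issues two `vpclmulqdq` instructions with immediates 0x01 and 0x10 and XORs the results. As a rough software model of that step (a sketch under assumptions, not the stub's code): bit 0 of the immediate selects the low or high 64-bit half of the first source, bit 4 does the same for the second source, and the two selected halves are multiplied without carries. `U128`, `clmul64`, `pclmulqdq_model`, and `fold_step` are hypothetical names, and XORing in the next 16 bytes of input is inferred from the neighbouring loads rather than shown in full in the hunk.

```cpp
#include <cassert>
#include <cstdint>

struct U128 { std::uint64_t lo; std::uint64_t hi; };

// Carry-less (GF(2) polynomial) multiply of two 64-bit values into 128 bits.
U128 clmul64(std::uint64_t a, std::uint64_t b) {
    U128 r{0, 0};
    for (int i = 0; i < 64; ++i) {
        if ((b >> i) & 1u) {
            r.lo ^= a << i;
            if (i != 0) r.hi ^= a >> (64 - i);   // skip i == 0: a shift by 64 is undefined
        }
    }
    return r;
}

// Assumed model of the immediate: bit 0 picks the half of src1, bit 4 the half of src2.
U128 pclmulqdq_model(U128 src1, U128 src2, unsigned imm8) {
    assert((imm8 & ~0x11u) == 0);                // only bits 0 and 4 are modeled
    const std::uint64_t a = (imm8 & 0x01u) ? src1.hi : src1.lo;
    const std::uint64_t b = (imm8 & 0x10u) ? src2.hi : src2.lo;
    return clmul64(a, b);
}

// One 16-byte fold: multiply each half of the accumulator by a constant from
// `k`, then XOR both products with the next block of input.
U128 fold_step(U128 acc, U128 k, U128 next) {
    const U128 p = pclmulqdq_model(acc, k, 0x01);  // acc.hi * k.lo
    const U128 q = pclmulqdq_model(acc, k, 0x10);  // acc.lo * k.hi
    return U128{p.lo ^ q.lo ^ next.lo, p.hi ^ q.hi ^ next.hi};
}
```

In this model `k` plays the role of `xmm10` (loaded from slot 1 of the table) and `acc` the role of `xmm7`; swapping the table changes only the value of `k`.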

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp

Lines changed: 15 additions & 6 deletions
@@ -6528,6 +6528,7 @@ address generate_avx_ghash_processBlocks() {
     if (VM_Version::supports_sse4_1() && VM_Version::supports_avx512_vpclmulqdq() &&
         VM_Version::supports_avx512bw() &&
         VM_Version::supports_avx512vl()) {
+      __ lea(table, ExternalAddress(StubRoutines::x86::crc_table_avx512_addr()));
       __ kernel_crc32_avx512(crc, buf, len, table, tmp1, tmp2);
     } else {
       __ kernel_crc32(crc, buf, len, table, tmp1);
@@ -6569,26 +6570,34 @@ address generate_avx_ghash_processBlocks() {
     const Register j = r9;
     const Register k = r10;
     const Register l = r11;
+    const Register table = r12;
 #ifdef _WIN64
     const Register y = rdi;
     const Register z = rsi;
 #else
     const Register y = rcx;
     const Register z = r8;
 #endif
-    assert_different_registers(crc, buf, len, a, j, k, l, y, z);
+    assert_different_registers(crc, buf, len, a, j, k, l, y, z, table);

     BLOCK_COMMENT("Entry:");
     __ enter(); // required for proper stackwalking of RuntimeStub frame
 #ifdef _WIN64
     __ push(y);
     __ push(z);
 #endif
-    __ crc32c_ipl_alg2_alt2(crc, buf, len,
-                            a, j, k,
-                            l, y, z,
-                            c_farg0, c_farg1, c_farg2,
-                            is_pclmulqdq_supported);
+    if (VM_Version::supports_sse4_1() && VM_Version::supports_avx512_vpclmulqdq() &&
+        VM_Version::supports_avx512bw() &&
+        VM_Version::supports_avx512vl()) {
+      __ lea(table, ExternalAddress(StubRoutines::x86::crc32c_table_avx512_addr()));
+      __ kernel_crc32_avx512(crc, buf, len, table, l, k);
+    } else {
+      __ crc32c_ipl_alg2_alt2(crc, buf, len,
+                              a, j, k,
+                              l, y, z,
+                              c_farg0, c_farg1, c_farg2,
+                              is_pclmulqdq_supported);
+    }
     __ movl(rax, crc);
 #ifdef _WIN64
     __ pop(z);
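
After this change the CRC32-C stub chooses between two code paths using the same four CPU feature checks as the CRC32 stub. A minimal sketch of that selection, where `CpuFeatures`, `Crc32cPath`, and `select_crc32c_path` are hypothetical stand-ins for the `VM_Version` queries and the two assembler calls in the diff:

```cpp
#include <cassert>

// Hypothetical stand-in for the VM_Version::supports_*() queries in the diff.
struct CpuFeatures {
    bool sse4_1;
    bool avx512_vpclmulqdq;
    bool avx512bw;
    bool avx512vl;
};

// The two code paths the stub generator can emit for CRC32-C.
enum class Crc32cPath {
    Avx512Kernel,   // kernel_crc32_avx512 with the CRC32-C constant table
    IplAlg2Alt2     // the pre-existing crc32c_ipl_alg2_alt2 path
};

// Mirrors the condition in the diff: the AVX-512 kernel is chosen only when
// all four features are present; anything less falls back to the old path.
constexpr Crc32cPath select_crc32c_path(const CpuFeatures& f) {
    return (f.sse4_1 && f.avx512_vpclmulqdq && f.avx512bw && f.avx512vl)
               ? Crc32cPath::Avx512Kernel
               : Crc32cPath::IplAlg2Alt2;
}
```

In the stub generator this decision is an ordinary C++ `if` evaluated while the stub is being emitted, so the generated machine code contains only the chosen path.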

src/hotspot/cpu/x86/stubRoutines_x86.cpp

Lines changed: 17 additions & 0 deletions
@@ -221,6 +221,23 @@ juint StubRoutines::x86::_crc_table_avx512[] =
   0x00000000UL, 0x00000000UL, 0x00000000UL, 0x00000000UL
 };

+juint StubRoutines::x86::_crc32c_table_avx512[] =
+{
+  0xb9e02b86UL, 0x00000000UL, 0xdcb17aa4UL, 0x00000000UL,
+  0x493c7d27UL, 0x00000000UL, 0xc1068c50UL, 0x0000000eUL,
+  0x06e38d70UL, 0x00000002UL, 0x6992cea2UL, 0x00000000UL,
+  0x493c7d27UL, 0x00000000UL, 0xdd45aab8UL, 0x00000000UL,
+  0xdea713f0UL, 0x00000000UL, 0x05ec76f0UL, 0x00000001UL,
+  0x47db8317UL, 0x00000000UL, 0x2ad91c30UL, 0x00000000UL,
+  0x0715ce53UL, 0x00000000UL, 0xc49f4f67UL, 0x00000000UL,
+  0x39d3b296UL, 0x00000000UL, 0x083a6eecUL, 0x00000000UL,
+  0x9e4addf8UL, 0x00000000UL, 0x740eef02UL, 0x00000000UL,
+  0xddc0152bUL, 0x00000000UL, 0x1c291d04UL, 0x00000000UL,
+  0xba4fc28eUL, 0x00000000UL, 0x3da6d0cbUL, 0x00000000UL,
+  0x493c7d27UL, 0x00000000UL, 0xc1068c50UL, 0x0000000eUL,
+  0x00000000UL, 0x00000000UL, 0x00000000UL, 0x00000000UL
+};
+
 juint StubRoutines::x86::_crc_by128_masks_avx512[] =
 {
   0xffffffffUL, 0xffffffffUL, 0x00000000UL, 0x00000000UL,
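
The kernel reads this table in 16-byte slots through `Address(table, n * 16)`, and its comments label slot 1 as rk1 and rk2 and the 512-bit load starting at slot 9 as "rk17-rk20, rk1,rk2, 0,0". The sketch below copies the new table from the diff and checks that layout. `kCrc32cTable`, `Slot`, and `load_slot` are names invented for illustration, and `juint` is assumed to be a 32-bit unsigned integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Values copied from _crc32c_table_avx512 in the diff (juint assumed 32-bit).
constexpr std::uint32_t kCrc32cTable[] = {
    0xb9e02b86u, 0x00000000u, 0xdcb17aa4u, 0x00000000u,
    0x493c7d27u, 0x00000000u, 0xc1068c50u, 0x0000000eu,
    0x06e38d70u, 0x00000002u, 0x6992cea2u, 0x00000000u,
    0x493c7d27u, 0x00000000u, 0xdd45aab8u, 0x00000000u,
    0xdea713f0u, 0x00000000u, 0x05ec76f0u, 0x00000001u,
    0x47db8317u, 0x00000000u, 0x2ad91c30u, 0x00000000u,
    0x0715ce53u, 0x00000000u, 0xc49f4f67u, 0x00000000u,
    0x39d3b296u, 0x00000000u, 0x083a6eecu, 0x00000000u,
    0x9e4addf8u, 0x00000000u, 0x740eef02u, 0x00000000u,
    0xddc0152bu, 0x00000000u, 0x1c291d04u, 0x00000000u,
    0xba4fc28eu, 0x00000000u, 0x3da6d0cbu, 0x00000000u,
    0x493c7d27u, 0x00000000u, 0xc1068c50u, 0x0000000eu,
    0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u,
};

constexpr std::size_t kWordsPerSlot = 4;  // 4 x 32 bits = one 16-byte slot
constexpr std::size_t kSlots =
    sizeof(kCrc32cTable) / sizeof(kCrc32cTable[0]) / kWordsPerSlot;

struct Slot { std::uint64_t lo; std::uint64_t hi; };

// The two 64-bit constants a little-endian 128-bit load from
// Address(table, index * 16) would observe.
Slot load_slot(std::size_t index) {
    assert(index < kSlots);
    const std::uint32_t* w = &kCrc32cTable[index * kWordsPerSlot];
    return Slot{(static_cast<std::uint64_t>(w[1]) << 32) | w[0],
                (static_cast<std::uint64_t>(w[3]) << 32) | w[2]};
}
```

With these values the table has 13 slots, slot 11 repeats slot 1, and slot 12 is all zeros, which matches the "rk1,rk2, 0,0" tail named in the kernel's comment.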

src/hotspot/cpu/x86/stubRoutines_x86.hpp

Lines changed: 2 additions & 0 deletions
@@ -137,6 +137,7 @@ class x86 {
 #ifdef _LP64
   static juint _crc_by128_masks_avx512[];
   static juint _crc_table_avx512[];
+  static juint _crc32c_table_avx512[];
   static juint _shuf_table_crc32_avx512[];
   static juint _adler32_shuf0_table[];
   static juint _adler32_shuf1_table[];
@@ -256,6 +257,7 @@ class x86 {
   static address crc_by128_masks_avx512_addr() { return (address)_crc_by128_masks_avx512; }
   static address shuf_table_crc32_avx512_addr() { return (address)_shuf_table_crc32_avx512; }
   static address crc_table_avx512_addr() { return (address)_crc_table_avx512; }
+  static address crc32c_table_avx512_addr() { return (address)_crc32c_table_avx512; }
   static address ghash_polynomial512_addr() { return _ghash_poly512_addr; }
 #endif // _LP64
   static address ghash_long_swap_mask_addr() { return _ghash_long_swap_mask_addr; }
