Internal review for the performance of VectorAPI on N2 #1
Closed
This patch aims to optimize the extract operation on vectors for AArch64
according to the Neoverse N2 and V1 software optimization guides[1][2].
Currently, the extract operation is used by "Vector.lane"[3]. As SVE
doesn't have direct instruction support for such an operation, unlike
"pextr"[4] in x86, the generated code is as below:
```
Byte512Vector.lane(7)
orr x8, xzr, #0x7
whilele p0.b, xzr, x8
lastb w10, p0, z16.b
sxtb w10, w10
```
This patch uses a NEON instruction instead when the target lane lies
within the NEON 128-bit range. For the same example above, the generated
code is much simpler:
```
smov x11, v16.b[7]
```
For cases where the target lane lies outside the NEON 128-bit range,
this patch uses EXT to shift the target lane to the lowest position. The
generated code is as below:
```
Byte512Vector.lane(63)
mov z17.d, z16.d
ext z17.b, z17.b, z17.b, #63
smov x10, v17.b[0]
```
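As a point of reference, here is a minimal Java sketch (illustrative only, not taken from the patch or its tests) of the Vector API usage that exercises this extract path; it requires --add-modules jdk.incubator.vector:
```
import jdk.incubator.vector.ByteVector;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical example: reading lanes from a 512-bit byte vector, the pattern
// that compiles down to the extract operation discussed above.
public class LaneExtractExample {
    static final VectorSpecies<Byte> SPECIES = ByteVector.SPECIES_512;

    public static void main(String[] args) {
        byte[] data = new byte[SPECIES.length()];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        ByteVector v = ByteVector.fromArray(SPECIES, data, 0);
        // Lane 7 lies within the low 128 bits, so the patched backend can use a
        // single NEON smov; lane 63 lies beyond 128 bits and needs EXT first.
        byte low = v.lane(7);
        byte high = v.lane(63);
        System.out.println(low + " " + high);
    }
}
```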
Tests: hotspot/compiler/vectorapi and jdk/incubator/vector passed on an SVE machine.
Refs:
// From the Arm Neoverse N2 and V1 Software Optimization Guides
// NOTE: The data inside "()" belongs to V1

| Instruction    | Latency | Throughput | Utilized Pipelines |
|----------------|---------|------------|--------------------|
| WHILELE        | 3       | 1 (1/2)    | M (M0)             |
| LASTB (scalar) | 5 (6)   | 1          | V1, M0             |
| EXT            | 2       | 2          | V (V01)            |
| UMOV, SMOV     | 2       | 1          | V                  |
| ORR            | 2       | 2          | V (V01)            |
| INS            | 2       | 2 (4)      | V                  |
[1] https://developer.arm.com/documentation/pjdoc466751330-9685/latest/
[2] https://developer.arm.com/documentation/PJDOC-466751330-18256/0001
[3] https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/IntVector.java#L2693
[4] https://www.felixcloutier.com/x86/pextrb:pextrd:pextrq
Change-Id: I90cfc1f8deb84145f42132d58d3b211c4a8933ad
This patch optimizes Vector.withLane for the 64- and 128-bit species. For
64- and 128-bit vectors, the insert operation can be implemented with ASIMD
instructions for better performance. E.g., for IntVector.SPECIES_128,
"IntVector.withLane(0, (int)4)" generates code as below:
```
Before:
orr w10, wzr, #0x4
index z17.s, #-16, #1
cmpeq p0.s, p7/z, z17.s, #-16
mov z17.d, z16.d
mov z17.s, p0/m, w10
After:
orr w10, wzr, #0x4
mov v16.s[0], w10
```
This patch also makes a small enhancement for vectors whose size is
greater than 128 bits: it can save one "DUP" if the target index is
smaller than 32. E.g., for ByteVector.SPECIES_512,
"ByteVector.withLane(0, (byte)4)" generates code as below:
```
Before:
index z18.b, #0, #1
mov z17.b, #0
cmpeq p0.b, p7/z, z18.b, z17.b
mov z17.d, z16.d
mov z17.b, p0/m, w16
After:
index z17.b, #-16, #1
cmpeq p0.b, p7/z, z17.b, #-16
mov z17.d, z16.d
mov z17.b, p0/m, w16
```
Change-Id: I700a28bc2fc15b6baca03b8d8574bb17992bf4a7
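For context, a minimal Java sketch (illustrative only, not part of the patch) of the withLane call discussed above; it requires --add-modules jdk.incubator.vector:
```
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical example: inserting a value into lane 0 of a 128-bit int vector.
public class WithLaneExample {
    static final VectorSpecies<Integer> SPECIES = IntVector.SPECIES_128;

    public static void main(String[] args) {
        IntVector v = IntVector.zero(SPECIES);
        // For a 128-bit species the patched backend can lower this insert to a
        // single NEON "mov v.s[0], w" instead of an SVE predicated move.
        IntVector w = v.withLane(0, 4);
        System.out.println(w.lane(0)); // prints 4
    }
}
```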
This patch speeds up add/mul/min/max reductions for the 64- and 128-bit
species on N2 machines.
According to the Neoverse N2 software optimization guide[1], ASIMD
reduction instructions are faster than SVE's for a 128-bit vector size.
This patch adds rules to distinguish the 64-bit and 128-bit vector sizes,
so that for these two special cases the generated code is the same as
NEON's. E.g., for ByteVector.SPECIES_128,
"ByteVector.reduceLanes(VectorOperators.ADD)" generates code as below:
```
Before:
orr x8, xzr, #0x10
whilelo p0.b, xzr, x8
uaddv d17, p0, z16.b
smov x15, v17.b[0]
add w15, w14, w15, sxtb
After:
addv b17, v16.16b
smov x12, v17.b[0]
add w12, w12, w16, sxtb
```
The performance improves by 60% ~ 100% on my test machine.
[1] https://developer.arm.com/documentation/PJDOC-466751330-18256/0001
Change-Id: Id4637cd4b0b7948864780eafa84150787697e4df
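For context, a minimal Java sketch (illustrative only, not one of the benchmarks behind the numbers above) of the reduction pattern this commit targets; it requires --add-modules jdk.incubator.vector:
```
import jdk.incubator.vector.ByteVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical example: an ADD reduction over a 128-bit byte vector, which the
// patched backend can lower to NEON addv instead of SVE uaddv with a predicate.
public class ReduceLanesExample {
    static final VectorSpecies<Byte> SPECIES = ByteVector.SPECIES_128;

    public static void main(String[] args) {
        byte[] data = new byte[SPECIES.length()];
        for (int i = 0; i < data.length; i++) {
            data[i] = 1;
        }
        ByteVector v = ByteVector.fromArray(SPECIES, data, 0);
        byte sum = v.reduceLanes(VectorOperators.ADD);
        System.out.println(sum); // prints 16
    }
}
```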
This patch uses the PTRUE instruction to create predicate registers for
partial vector operations, which is more efficient than the WHILELO
instruction according to the software optimization guides of N2[1] and V1[2].
[1] https://developer.arm.com/documentation/PJDOC-466751330-18256/0001
[2] https://developer.arm.com/documentation/pjdoc466751330-9685/latest/
Change-Id: I9be5aa82ab567e19c62c698dc9c4d852efd2f607
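A minimal Java sketch of where such partial predicates arise, assuming the code runs on an SVE machine whose vector registers are wider than the fixed 128-bit species used here (illustrative only; requires --add-modules jdk.incubator.vector):
```
import java.util.Arrays;
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical example: operating on a fixed 128-bit species. On wider SVE
// hardware the backend needs a partial predicate for these operations, which
// the patch now materializes with PTRUE instead of WHILELO.
public class PartialSpeciesExample {
    static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_128;

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f, 4f};
        float[] b = {5f, 6f, 7f, 8f};
        float[] c = new float[SPECIES.length()];
        FloatVector va = FloatVector.fromArray(SPECIES, a, 0);
        FloatVector vb = FloatVector.fromArray(SPECIES, b, 0);
        va.add(vb).intoArray(c, 0);
        System.out.println(Arrays.toString(c));
    }
}
```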
e1iu pushed a commit that referenced this pull request on Mar 24, 2022:
…or 64/128-bit vector sizes
This patch optimizes the SVE backend implementations of Vector.lane and
Vector.withLane for 64/128-bit vector sizes. The basic idea is to use
lower-cost NEON instructions when the vector size is 64 or 128 bits.
1. Vector.lane(int i) (Gets the lane element at lane index i)
As SVE doesn't have direct instruction support for extraction, unlike
"pextr"[1] in x86, the generated code was as below:
```
Byte512Vector.lane(7)
orr x8, xzr, #0x7
whilele p0.b, xzr, x8
lastb w10, p0, z16.b
sxtb w10, w10
```
This patch uses a NEON instruction instead when the target lane lies
within the NEON 128-bit range. For the same example above, the generated
code is now much simpler:
```
smov x11, v16.b[7]
```
For cases where the target lane lies outside the NEON 128-bit range,
this patch uses EXT to shift the target lane to the lowest position. The
generated code is as below:
```
Byte512Vector.lane(63)
mov z17.d, z16.d
ext z17.b, z17.b, z17.b, #63
smov x10, v17.b[0]
```
2. Vector.withLane(int i, E e) (Replaces the lane element of this vector
at lane index i with value e)
For 64/128-bit vectors, the insert operation can be implemented with NEON
instructions for better performance. E.g., for IntVector.SPECIES_128,
"IntVector.withLane(0, (int)4)" generates code as below:
```
Before:
orr w10, wzr, #0x4
index z17.s, #-16, #1
cmpeq p0.s, p7/z, z17.s, #-16
mov z17.d, z16.d
mov z17.s, p0/m, w10
After:
orr w10, wzr, #0x4
mov v16.s[0], w10
```
This patch also makes a small enhancement for vectors whose sizes are
greater than 128 bits: it can save one "DUP" if the target index is
smaller than 32. E.g., for ByteVector.SPECIES_512,
"ByteVector.withLane(0, (byte)4)" generates code as below:
```
Before:
index z18.b, #0, #1
mov z17.b, #0
cmpeq p0.b, p7/z, z18.b, z17.b
mov z17.d, z16.d
mov z17.b, p0/m, w16
After:
index z17.b, #-16, #1
cmpeq p0.b, p7/z, z17.b, #-16
mov z17.d, z16.d
mov z17.b, p0/m, w16
```
With this patch, we can see up to a 200% performance gain for specific
vector microbenchmarks on my SVE test system.
[TEST]
test/jdk/jdk/incubator/vector and test/hotspot/jtreg/compiler/vectorapi
passed without failure.
[1] https://www.felixcloutier.com/x86/pextrb:pextrd:pextrq
Change-Id: Ic2a48f852011978d0f252db040371431a339d73c
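A minimal JMH-style sketch of the kind of microbenchmark that would exercise these paths (hypothetical names and setup; not the actual benchmarks behind the quoted numbers):
```
import jdk.incubator.vector.ByteVector;
import jdk.incubator.vector.VectorSpecies;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

// Hypothetical benchmark measuring the lane/withLane paths on a 128-bit species.
@State(Scope.Thread)
public class LaneBench {
    static final VectorSpecies<Byte> SPECIES = ByteVector.SPECIES_128;
    byte[] data = new byte[SPECIES.length()];
    ByteVector v;

    @Setup
    public void setup() {
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        v = ByteVector.fromArray(SPECIES, data, 0);
    }

    @Benchmark
    public byte lane() {
        return v.lane(7);
    }

    @Benchmark
    public ByteVector withLane() {
        return v.withLane(0, (byte) 4);
    }
}
```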
e1iu pushed a commit that referenced this pull request on Mar 29, 2022:
This patch optimizes the backend implementation of VectorMaskToLong for
AArch64 by using BEXT[2], which is available in SVE2, providing a more
efficient way to move mask bits from a predicate register to a
general-purpose register, as x86 PMOVMSK[1] does.
With this patch, the final code (input mask is byte type with
SPECIES_512, generated on a QEMU emulator with a 512-bit SVE vector
register size) changes as below:
Before:
mov z16.b, p0/z, #1
fmov x0, d16
orr x0, x0, x0, lsr #7
orr x0, x0, x0, lsr #14
orr x0, x0, x0, lsr #28
and x0, x0, #0xff
fmov x8, v16.d[1]
orr x8, x8, x8, lsr #7
orr x8, x8, x8, lsr #14
orr x8, x8, x8, lsr #28
and x8, x8, #0xff
orr x0, x0, x8, lsl #8
orr x8, xzr, #0x2
whilele p1.d, xzr, x8
lastb x8, p1, z16.d
orr x8, x8, x8, lsr #7
orr x8, x8, x8, lsr #14
orr x8, x8, x8, lsr #28
and x8, x8, #0xff
orr x0, x0, x8, lsl #16
orr x8, xzr, #0x3
whilele p1.d, xzr, x8
lastb x8, p1, z16.d
orr x8, x8, x8, lsr #7
orr x8, x8, x8, lsr #14
orr x8, x8, x8, lsr #28
and x8, x8, #0xff
orr x0, x0, x8, lsl #24
orr x8, xzr, #0x4
whilele p1.d, xzr, x8
lastb x8, p1, z16.d
orr x8, x8, x8, lsr #7
orr x8, x8, x8, lsr #14
orr x8, x8, x8, lsr #28
and x8, x8, #0xff
orr x0, x0, x8, lsl #32
mov x8, #0x5
whilele p1.d, xzr, x8
lastb x8, p1, z16.d
orr x8, x8, x8, lsr #7
orr x8, x8, x8, lsr #14
orr x8, x8, x8, lsr #28
and x8, x8, #0xff
orr x0, x0, x8, lsl #40
orr x8, xzr, #0x6
whilele p1.d, xzr, x8
lastb x8, p1, z16.d
orr x8, x8, x8, lsr #7
orr x8, x8, x8, lsr #14
orr x8, x8, x8, lsr #28
and x8, x8, #0xff
orr x0, x0, x8, lsl #48
orr x8, xzr, #0x7
whilele p1.d, xzr, x8
lastb x8, p1, z16.d
orr x8, x8, x8, lsr #7
orr x8, x8, x8, lsr #14
orr x8, x8, x8, lsr #28
and x8, x8, #0xff
orr x0, x0, x8, lsl #56
After:
mov z16.b, p0/z, #1
mov z17.b, #1
bext z16.d, z16.d, z17.d
mov z17.d, #0
uzp1 z16.s, z16.s, z17.s
uzp1 z16.h, z16.h, z17.h
uzp1 z16.b, z16.b, z17.b
mov x0, v16.d[0]
[1] https://www.felixcloutier.com/x86/pmovmskb
[2] https://developer.arm.com/documentation/ddi0602/2020-12/SVE-Instructions/BEXT--Gather-lower-bits-from-positions-selected-by-bitmask-
Change-Id: Ia983a20c89f76403e557ac21328f2f2e05dd08e0
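For context, a minimal Java sketch (illustrative only) of the mask-to-long conversion this commit speeds up; it requires --add-modules jdk.incubator.vector:
```
import jdk.incubator.vector.ByteVector;
import jdk.incubator.vector.VectorMask;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical example: collapsing a 64-lane byte mask to a long bitmap, which
// the patched backend lowers with SVE2 BEXT instead of a per-lane sequence.
public class MaskToLongExample {
    static final VectorSpecies<Byte> SPECIES = ByteVector.SPECIES_512;

    public static void main(String[] args) {
        byte[] data = new byte[SPECIES.length()];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 2);
        }
        ByteVector v = ByteVector.fromArray(SPECIES, data, 0);
        VectorMask<Byte> mask = v.compare(VectorOperators.NE, (byte) 0);
        long bits = mask.toLong();
        System.out.println(Long.toBinaryString(bits));
    }
}
```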
e1iu pushed a commit that referenced this pull request on Mar 29, 2022:
…th SVE2
This patch implements AArch64 codegen for VectorLongToMask using the
SVE2 BitPerm feature. With this patch, the final code (generated on a
QEMU emulator with a 512-bit SVE vector register size) is as below:
mov z17.b, #0
mov v17.d[0], x13
sunpklo z17.h, z17.b
sunpklo z17.s, z17.h
sunpklo z17.d, z17.s
mov z16.b, #1
bdep z17.d, z17.d, z16.d
cmpne p0.b, p7/z, z17.b, #0
Change-Id: Ia83e80bbd879f86fef5dd607e44c530f2ce143d0
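For context, a minimal Java sketch (illustrative only) of the long-to-mask conversion this commit implements with SVE2 BDEP; it requires --add-modules jdk.incubator.vector:
```
import jdk.incubator.vector.ByteVector;
import jdk.incubator.vector.VectorMask;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical example: building a 64-lane byte mask from a long bitmap.
public class MaskFromLongExample {
    static final VectorSpecies<Byte> SPECIES = ByteVector.SPECIES_512;

    public static void main(String[] args) {
        // Set every other lane of the mask from the bit pattern 0101...
        VectorMask<Byte> mask = VectorMask.fromLong(SPECIES, 0x5555555555555555L);
        System.out.println(mask.trueCount()); // prints 32
    }
}
```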