[benchmark] Existential Redux #20666
@aschwaighofer, as the original author of these benchmarks, could you please help me provide a description of their purpose? @atrick asked me to always add one for new tests over in #20552. My impression is that the different variants exercise differently sized value buffers; the comment in SortLargeExistentials only hints at this.
Also, I believe I've discovered a bug with releasing an existential IUO. Extracting the setup into array initialization functions did eliminate the setup overhead for all tests that do not modify the array. However, tests that mutate the array still have measurable overhead, because they perform COW (I think?).

Setup Overhead: [table omitted]

After Setup Extraction: [table omitted]

Note: Setup overhead under 5% of benchmark runtime is not reported.

I've tried to work around this by transferring ownership of the pre-initialized array with the `grabArray` function:

```swift
func grabArray() -> [Existential] { // transfer array ownership to caller
    defer { array = nil }
    return array
}
```

This doesn't work either, and makes me think there's some strange instruction reordering going on:

```swift
func grabArray() -> [Existential] {
    guard array != nil else { fatalError("What?!?") }
    let a = array!
    array = nil
    return a
}
```

…because this does trigger the fatal error. Unfortunately, I wasn't able to reproduce this behavior in a minimal test case outside of SBS. Filed as SR-9298.
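For context, a minimal sketch of the pattern under discussion, with hypothetical names (`mutateIt`, the element count): as long as the global `array` keeps a second reference to the storage, the first in-place mutation inside the run function copies the whole array, which is the COW overhead the `grabArray` workaround tries to avoid by transferring ownership.

```swift
protocol Existential { mutating func mutateIt() -> Bool }

var array: [Existential]! // prepared by the benchmark's setUpFunction

public func run_ArrayMutating(_ N: Int) {
    // Reading the global leaves `array` holding a second reference to the
    // storage, so the first in-place mutation below copies it all (COW).
    var a: [Existential] = array
    // The intended fix: `var a = grabArray()` would make `a` the sole
    // owner of the storage, letting the loop mutate in place.
    for _ in 0 ..< N {
        for i in a.indices { _ = a[i].mutateIt() }
    }
}
```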
@eeckstein can you please run the benchmark here, too? 🤖 doesn't recognize me yet…
The purpose of those benchmarks was to evaluate implementation decisions in different scenarios when I moved the implementation of existentials (protocol values) to heap-based copy-on-write buffers. I don't think we want to run those as part of regular performance tests. The performance boost of ClassValueBuffer4 vs ClassValueBuffer3 is expected: in the outline case (ClassValueBuffer4), copying the existential only involves copying one reference to the heap-based copy-on-write buffer that holds the struct, vs copying the individual fields of the struct in the inline case of ClassValueBuffer3.
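To illustrate the inline/outline distinction described above, here is a hedged sketch (not the benchmark's exact declarations; the method name `doIt` is illustrative): a struct with up to three references fits the three-word inline value buffer of Swift's existential container, while a four-reference struct spills into a heap-allocated copy-on-write box.

```swift
final class Ref {}

protocol Existential { func doIt() -> Bool }

// Fits the 3-word inline buffer of the existential container;
// copying the existential copies all three references.
struct ClassValueBuffer3: Existential {
    var a = Ref(), b = Ref(), c = Ref()
    func doIt() -> Bool { return true }
}

// Too big for the inline buffer; stored in a heap-allocated COW box,
// so copying the existential retains a single buffer reference.
struct ClassValueBuffer4: Existential {
    var a = Ref(), b = Ref(), c = Ref(), d = Ref()
    func doIt() -> Bool { return true }
}

let x: Existential = ClassValueBuffer4()
let y = x // copies one reference to the shared COW buffer
_ = y.doIt()
```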
Why not? They still show how well the optimizer handles these cases. After this cleanup it takes 1.3 seconds to run all 108 of them in a manner equivalent to one `run_smoke_bench` run.
@swift-ci benchmark
@swift-ci please smoke test
Benchmark DistinctClassFieldAccesses was added in a strange place… See https://github.com/apple/swift/pull/18892/files#r212038958 Per discussion in swiftlang#18892 (comment) I’m moving it to the extremely similar `ArrayInClass`, while fixing its workload size (and loop pattern) to make it relatively comparable.
Reduced base workloads. This allows precise measurements without the noise of accumulated error from unnecessarily long runtimes.
Painstakingly reverse-engineered the boilerplate generator that produces ExistentialPerformance.swift (sans a few whitespace differences), to ease maintenance (for the upcoming refactoring).
Sanity refactoring:
* Generate `BenchmarkInfo` in the same order as the run loops.
* Generate `IntValueBuffer0`, too.
* Minor code formatting tweaks.
Renamed tests according to the naming convention proposed in PR swiftlang#20334. They now fit the 40-character limit and are structured into groups and variants. Also extracted tags into 2 constants, to simplify the `[BenchmarkInfo]` expression.
The Existential.Array group had setup overhead caused by initialization of the existential array. Since this seems to be quite substantial and is dependent on the existential type, it makes sense to add an Array.init benchmark group that will measure it explicitly. Array setup is extracted into 9 type-specific functions. The setup inside the `run_` functions now grabs that array, prepared in `setUpFunction`, excluding the initialization from the measurement. This helped with extracting setup overhead in most cases, but it appears that the mutating tests still have measurable overhead because they perform COW. I’ve tried to work around this by transferring ownership of the pre-initialized array with the `grabArray` function, but nilling the IUO was crashing at runtime. This should be fixed later.
Break up BenchmarkInfo into 2 lines to fit the 80-column line limit.
Benchmarks from the Array group (except the `init`) don’t use the `withType` parameter of the `run_` function anymore. The type-specific variation is taken care of in the `BenchmarkInfo`, since the `existentialArray` with the appropriate type is created by the `setUpFunction`. This means we can reuse the same non-generic `run_` function for the whole group and eliminate all the specialized `runFunction` variants. This eliminates 144 lines of boilerplate.
The technique from the preceding commit can be used to fully determine the tested type variant in the `setUpFunction` and to use non-generic `runFunction`s for all the benchmarks (see the sketch below). This eliminates 202 lines of boilerplate.
Bumping up the multipliers to get above 20 μs runtime.
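A minimal sketch of the setup-extraction pattern from the commits above, with illustrative names and constants (the real declarations live in ExistentialPerformance.swift.gyb; `.existential` is the category this PR introduces):

```swift
import TestsUtils // SBS support module providing BenchmarkInfo

protocol Existential {
    init()
    mutating func mutateIt() -> Bool
}

final class Ref {}

struct ClassValueBuffer4: Existential {
    var a = Ref(), b = Ref(), c = Ref(), d = Ref() // spills into a COW box
    init() {}
    mutating func mutateIt() -> Bool { a = Ref(); return true }
}

var existentialArray: [Existential]!

func makeArray<T: Existential>(_: T.Type) -> [Existential] {
    return (0 ..< 128).map { _ in T() as Existential } // count is illustrative
}

// The type variant is fixed entirely in the setUpFunction, so one
// non-generic run function serves every variant of the group.
public func run_ArrayMutating(_ N: Int) {
    var a: [Existential] = existentialArray
    for _ in 0 ..< N {
        for i in a.indices { _ = a[i].mutateIt() }
    }
}

public let ExistentialArrayMutating = BenchmarkInfo(
    name: "Existential.Array.Mutating.Ref4", // illustrative name
    runFunction: run_ArrayMutating,
    tags: [.api, .existential],
    setUpFunction: { existentialArray = makeArray(ClassValueBuffer4.self) },
    tearDownFunction: { existentialArray = nil })
```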
Force-pushed from 31b5b44 to b61a63d.
@swift-ci Please benchmark
Build comment file:

Performance and code size results for -O, -Osize, and -Onone: [tables omitted]

How to read the data: The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%. If you see any unexpected regressions, you should consider fixing them. Noise: sometimes the performance results (not code size!) contain false alarms.

Hardware Overview: [omitted]
@swift-ci please smoke test
As I've explained above, the setup overhead cannot yet be completely eliminated due to SR-9298. The whole set of 109 benchmarks adds less than 1 second to the full measured iteration of the benchmark suite.

@eeckstein Please review 🙏 (when you get back...)
As @aschwaighofer (the author of the benchmarks) already said, these benchmarks are not really there to find optimization regressions/improvements. Arnold added them to evaluate different implementation scenarios. Even if the runtime of those benchmarks is minimal, I'm not in favor of adding benchmarks which are not really useful.
In that case they should be removed from SBS altogether. Edit: we should probably discuss the philosophy of extending performance coverage and build a consensus in the forums. I'll open that topic once I'm done with the janitor duty; there are still some loose ends to tie...

@eeckstein Are you objecting to re-enabling these benchmarks or to the refactoring+renaming itself?
Don’t run the ExistentialPerformance benchmarks as part of the pre-commit suite.
@swift-ci please benchmark

!!! Couldn't read commit file !!!

@swift-ci please smoke test
yes, that would be fine

@eeckstein Would you put that in the review or can I just merge once the checks are completed?
Performance and code size results for -O, -Osize, and -Onone: [tables omitted]

How to read the data: The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%. If you see any unexpected regressions, you should consider fixing them. Noise: sometimes the performance results (not code size!) contain false alarms.

Hardware Overview: [omitted]
This PR reintroduces the Existential benchmark family with 108 performance tests. It tests the performance of a varying number of non-mutating and mutating method calls on instances of existential types of growing size, and on arrays thereof (12 groups × 9 variants).

It was previously marked as unstable and was not executed as part of the pre-commit suite. The root of the instability was inner loop multipliers that were too big (5_000_000 and 5_000), which made the workload very susceptible to the accumulation of measurement error caused by context switching. Lowering the inner loop multipliers so that each benchmark runs in under 1000 μs results in clearly defined runtimes when measured with the More Robust Benchmark_Driver.
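For illustration, a hedged sketch of the loop-sizing change (the protocol, names, and the new multiplier are stand-ins, not the PR's exact code):

```swift
protocol Existential { func doIt() -> Bool }

final class Ref {}
struct Ref4: Existential { // stand-in for a 4-reference variant
    var a = Ref(), b = Ref(), c = Ref(), d = Ref()
    func doIt() -> Bool { return true }
}

public func run_PassMethod2x(_ N: Int) {
    let e: Existential = Ref4()
    var alive = true
    // Before: `0 ..< N * 5_000_000` made a single sample run for
    // milliseconds, accumulating context-switch noise.
    // After: a small multiplier keeps one sample under ~1000 μs.
    for _ in 0 ..< N * 100 {
        alive = alive && e.doIt() && e.doIt() // two method calls per pass
    }
    precondition(alive)
}
```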
The benchmarks were renamed according to the naming convention proposed in #20334, as the original names were among the longest in the SBS. For example, `ExistentialTestPassExistentialTwoMethodCalls_ClassValueBuffer4` became `Existential.Pass.method.2x.Ref4`.

The smaller workload revealed significant setup overhead caused by initialization of the existential array in the `Existential.Array` group, which was isolated into type-specific `setUpFunction`s. A new `Existential.Array.init` benchmark group was added to measure this explicitly.

A new `BenchmarkCategory.existential` was added to tag these tests. Running these 108 tests in a manner equivalent to `run_smoke_bench` (`time ./Benchmark_O --tags=existential --sample-time=0.0025 --num-samples=3`) takes 1.3 seconds on my 2008 MBP: properly sized benchmarks improve measurement precision and have negligible impact on the overall time it takes to run the benchmark suite.

Procedural note: This refactoring was facilitated by first reverse-engineering a `.gyb` file that generates the original 810 LOC Swift file. This allowed for systematic changes to the template used to generate the original 99 tests. Isolating the array creation into a setup method enabled removal of the specialized `run_` functions, eliminating 144 LOC. This technique was then applied to the rest of the tests, eliminating another 202 LOC. The use of GYB has proven indispensable in performing this refactoring, vastly improving code maintainability.

The refactoring is thoroughly documented in the commit descriptions; it is therefore best to review this PR by individual commits (check the `.gyb` first, then the generated `.swift` for the results of the change).
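As a taste of the approach, here is a toy GYB fragment in the spirit of the template (the variant list and field names are made up for illustration; see the actual ExistentialPerformance.swift.gyb for the real thing):

```
%{ Sizes = [0, 1, 2, 3, 4] }%
% for Size in Sizes:
struct IntValueBuffer${Size}: Existential {
%   for i in range(Size):
  var f${i}: Int = 0
%   end
  func doIt() -> Bool { return true }
}
% end
```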