Skip to content

[String] Last-minute ABI adjustment: 4-bit discriminator #21310

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 20, 2018

Conversation

milseman
Copy link
Member

@milseman milseman commented Dec 13, 2018

Switch ABI to only use 4 discriminator bits.

In anticipation of potential future HW features, e.g. armv8.5 memory
tagging, only use the high 4 bytes as discriminator bits in
_BridgeObject rather than the top 8 bits. Utilize two perf flags to
cover this instead. This requires shifting around a fair amount of
internal complexity.

This includes a sizable refactoring of 32-bit support, making this kind of adjustment possible.

@milseman
Copy link
Member Author

@swift-ci please test

@milseman
Copy link
Member Author

@swift-ci please benchmark

@milseman milseman requested a review from lorentey December 13, 2018 23:53
@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - df24747010a7cd52069012f26ea571ce4c752ec3

@swift-ci
Copy link
Contributor

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Improvement
StringBuilder 374 345 -7.8% 1.08x
StringBuilderSmallReservingCapacity 385 357 -7.3% 1.08x

Code size: -O

TEST OLD NEW DELTA RATIO
Regression
StringBuilder.o 7316 7652 +4.6% 0.96x
Improvement
InsertCharacter.o 5176 5096 -1.5% 1.02x
StringMatch.o 4606 4541 -1.4% 1.01x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Regression
CountAlgoString 1815 2000 +10.2% 0.91x
Improvement
CharIndexing_tweet_unicodeScalars_Backwards 19777 9503 -51.9% 2.08x
CharIndexing_ascii_unicodeScalars_Backwards 10052 4862 -51.6% 2.07x
CharIndexing_tweet_unicodeScalars 19426 9459 -51.3% 2.05x
CharIndexing_ascii_unicodeScalars 9858 4840 -50.9% 2.04x
CharIndexing_russian_unicodeScalars 9907 5023 -49.3% 1.97x
CharIndexing_punctuated_unicodeScalars 2380 1240 -47.9% 1.92x
CharIndexing_russian_unicodeScalars_Backwards 9849 5237 -46.8% 1.88x
CharIndexing_korean_unicodeScalars_Backwards 12048 6415 -46.8% 1.88x
CharIndexing_korean_unicodeScalars 11528 6177 -46.4% 1.87x
CharIndexing_chinese_unicodeScalars 8527 4643 -45.5% 1.84x
CharIndexing_punctuated_unicodeScalars_Backwards 2448 1360 -44.4% 1.80x
CharIndexing_chinese_unicodeScalars_Backwards 8911 5068 -43.1% 1.76x
CharIndexing_japanese_unicodeScalars 13324 7708 -42.1% 1.73x
CharIndexing_japanese_unicodeScalars_Backwards 14132 8354 -40.9% 1.69x
CharIndexing_punctuatedJapanese_unicodeScalars 2062 1280 -37.9% 1.61x
CharIndexing_punctuatedJapanese_unicodeScalars_Backwards 2188 1372 -37.3% 1.59x
CharIndexing_utf16_unicodeScalars 11052 7335 -33.6% 1.51x
CharIndexing_utf16_unicodeScalars_Backwards 12013 8306 -30.9% 1.45x
CharIteration_russian_unicodeScalars_Backwards 5757 5282 -8.3% 1.09x
StringBuilderSmallReservingCapacity 380 351 -7.6% 1.08x
StringBuilder 368 340 -7.6% 1.08x
UTF8Decode_InitFromBytes_ascii 502 465 -7.4% 1.08x
StringAdder 468 436 -6.8% 1.07x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Regression
StringBuilder.o 7206 7534 +4.6% 0.96x
Improvement
StringMatch.o 4649 4529 -2.6% 1.03x
StringWalk.o 35202 34634 -1.6% 1.02x
CSVParsing.o 30681 30273 -1.3% 1.01x
InsertCharacter.o 4815 4751 -1.3% 1.01x
StringEdits.o 12014 11862 -1.3% 1.01x
ReversedCollections.o 11842 11706 -1.1% 1.01x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB
--------------

@milseman
Copy link
Member Author

@swift-ci please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - df24747010a7cd52069012f26ea571ce4c752ec3

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - df24747010a7cd52069012f26ea571ce4c752ec3

Copy link
Member

@lorentey lorentey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice simplifications!

@milseman
Copy link
Member Author

Trying an asserts build locally to repro. Seems to just be a few _internalInvariant checks that need updating.

@milseman
Copy link
Member Author

@swift-ci please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 0f12e0f0b8728e20341f360f4586348301a44dc2

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 0f12e0f0b8728e20341f360f4586348301a44dc2

@milseman milseman changed the title [DO NOT MERGE] WIP: Refactorings in preparation for some ABI changes [String] Last-minute ABI adjustment: 4-bit discriminator Dec 17, 2018
@milseman
Copy link
Member Author

@swift-ci please benchmark

@milseman
Copy link
Member Author

@swift-ci please test

@milseman
Copy link
Member Author

@lorentey can you take a look? In addition to the refactoring, this has ABI changes to only use 4-bit discriminators.

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 1096140

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 1096140

self._storage.0,
self._storage.1 & _StringObject.Nibbles.largeAddressMask)
}
return (self._storage.0, self._storage.1 & 0x00FF_FFFF_FFFF_FFFF)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Losing the nice explanatory name on the mask here makes me a bit sad

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but that name was misleading and conflating. This is a mask only for the small-string contents, which no longer aligns with addresses.

I thought about adding a static property for this, which would be inlinable and in our ABI, and only used in this one place. I'm not sure how to weigh the "at least it has an name" with "one level of cognitive indirection" if there's only one use site.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A local name could help readability here. let smallStringCodeUnitsMask = 0x00FF_FFFF_FFFF_FFFF or whatever. An actual static property is probably overkill.

@@ -507,7 +510,7 @@ extension _StringStorage {
@_effects(releasenone)
private func _postRRCAdjust(newCount: Int, newIsASCII: Bool) {
let countAndFlags = CountAndFlags(
count: newCount, isASCII: newIsASCII)
mortalCount: newCount, isASCII: newIsASCII)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"mortalCount" is the most metal variable name I've seen in a while

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤘

@@ -281,9 +281,13 @@ extension String {
}
}
if _guts._object.isImmortal {
// TODO: We'd rather emit a valid ObjC object statically than create a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a bug tracking this? I should either CC on it or steal it if so

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do, then I assigned it to you ;-)

return _SharedStringStorage(
immortal: _guts._object.fastUTF8.baseAddress!,
countAndFlags: _guts._object._countAndFlags)
countAndFlags: _StringObject.CountAndFlags(
sharedCount: _guts.count, isASCII: gutsCountAndFlags.isASCII))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the reasoning for making a new CountAndFlags here rather than using gutsCountAndFlags? Just Shared Strings needing some different flags?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shared strings do not have tail-allocated contents, which is one of the new perf flags, so we need to strip it. I very slightly prefer using initializers than mutation, because it makes coding style more uniform for things like _invariantCheck() not being called on temporarily-invalid states.

@milseman
Copy link
Member Author

@swift-ci
Copy link
Contributor

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Regression
CharIndexing_chinese_unicodeScalars 4109 4964 +20.8% 0.83x
CharIteration_tweet_unicodeScalars_Backwards 8095 9723 +20.1% 0.83x
CharIteration_ascii_unicodeScalars_Backwards 4080 4894 +20.0% 0.83x
CharIndexing_japanese_unicodeScalars 6795 8104 +19.3% 0.84x
CharIndexing_ascii_unicodeScalars 4361 5189 +19.0% 0.84x
CharIndexing_russian_unicodeScalars 4932 5866 +18.9% 0.84x
CharIteration_punctuated_unicodeScalars_Backwards 984 1153 +17.2% 0.85x
CharIndexing_punctuatedJapanese_unicodeScalars 1128 1320 +17.0% 0.85x
CharIndexing_punctuated_unicodeScalars 1133 1310 +15.6% 0.86x
CharIteration_russian_unicodeScalars 2705 3116 +15.2% 0.87x
CharIndexing_tweet_unicodeScalars 8809 10144 +15.2% 0.87x
CharIndexing_tweet_unicodeScalars_Backwards 8545 9839 +15.1% 0.87x
CharIndexing_ascii_unicodeScalars_Backwards 4364 5010 +14.8% 0.87x
CharIteration_chinese_unicodeScalars_Backwards 4127 4729 +14.6% 0.87x
CharIndexing_utf16_unicodeScalars 6445 7373 +14.4% 0.87x
CharIteration_chinese_unicodeScalars 2398 2720 +13.4% 0.88x
CharIteration_russian_unicodeScalars_Backwards 4613 5209 +12.9% 0.89x
CharIndexing_korean_unicodeScalars 5847 6584 +12.6% 0.89x
CharIteration_korean_unicodeScalars 3141 3518 +12.0% 0.89x
CharIteration_korean_unicodeScalars_Backwards 5584 6243 +11.8% 0.89x
CharIteration_punctuatedJapanese_unicodeScalars_Backwards 1046 1163 +11.2% 0.90x
CharIteration_japanese_unicodeScalars_Backwards 7044 7781 +10.5% 0.91x
CharIndexing_punctuated_unicodeScalars_Backwards 1133 1251 +10.4% 0.91x (?)
CharIteration_utf16_unicodeScalars_Backwards 7009 7653 +9.2% 0.92x
CharIndexing_chinese_unicodeScalars_Backwards 4282 4661 +8.9% 0.92x
CharIteration_utf16_unicodeScalars 3550 3857 +8.6% 0.92x
CharIndexing_russian_unicodeScalars_Backwards 4741 5136 +8.3% 0.92x
Hanoi 3097 3350 +8.2% 0.92x
CharIteration_punctuated_unicodeScalars 605 653 +7.9% 0.93x
Improvement
Breadcrumbs.CopyUTF16CodeUnits.Mixed 60 51 -15.0% 1.18x
StringBuilder 338 298 -11.8% 1.13x
StringBuilderSmallReservingCapacity 348 307 -11.8% 1.13x
DataSetCountSmall 138 123 -10.9% 1.12x
StringUTF16Builder 360 323 -10.3% 1.11x
StringAdder 424 384 -9.4% 1.10x

Code size: -O

TEST OLD NEW DELTA RATIO
Regression
StringBuilder.o 7316 7700 +5.2% 0.95x
ArrayAppend.o 38744 39176 +1.1% 0.99x
Improvement
InsertCharacter.o 5176 5096 -1.5% 1.02x
StringMatch.o 4606 4541 -1.4% 1.01x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Regression
CharIteration_tweet_unicodeScalars_Backwards 8731 10047 +15.1% 0.87x
CharIteration_ascii_unicodeScalars_Backwards 4406 5068 +15.0% 0.87x
CharIteration_punctuated_unicodeScalars_Backwards 1020 1156 +13.3% 0.88x
CharIteration_ascii_unicodeScalars 2566 2892 +12.7% 0.89x
CharIteration_russian_unicodeScalars 2853 3179 +11.4% 0.90x
CharIteration_tweet_unicodeScalars 4193 4620 +10.2% 0.91x
CharIteration_korean_unicodeScalars_Backwards 5676 6214 +9.5% 0.91x
CharIteration_utf16_unicodeScalars 3447 3758 +9.0% 0.92x
CharIteration_punctuatedJapanese_unicodeScalars 625 680 +8.8% 0.92x
CharIteration_punctuated_unicodeScalars 615 666 +8.3% 0.92x
Improvement
CharIndexing_tweet_unicodeScalars_Backwards 17766 9839 -44.6% 1.81x
CharIndexing_ascii_unicodeScalars_Backwards 9026 5144 -43.0% 1.75x
CharIndexing_punctuated_unicodeScalars_Backwards 2206 1264 -42.7% 1.75x
CharIndexing_korean_unicodeScalars_Backwards 10822 6341 -41.4% 1.71x
CharIndexing_chinese_unicodeScalars_Backwards 8033 4761 -40.7% 1.69x
CharIndexing_russian_unicodeScalars 8974 5331 -40.6% 1.68x
CharIndexing_russian_unicodeScalars_Backwards 8807 5241 -40.5% 1.68x
CharIndexing_japanese_unicodeScalars_Backwards 12702 7760 -38.9% 1.64x
CharIndexing_korean_unicodeScalars 10470 6451 -38.4% 1.62x
CharIndexing_tweet_unicodeScalars 17445 10795 -38.1% 1.62x
CharIndexing_punctuated_unicodeScalars 2142 1343 -37.3% 1.59x
CharIndexing_punctuatedJapanese_unicodeScalars_Backwards 1979 1264 -36.1% 1.57x
CharIndexing_ascii_unicodeScalars 8851 5658 -36.1% 1.56x
CharIndexing_chinese_unicodeScalars 7630 4919 -35.5% 1.55x
CharIndexing_japanese_unicodeScalars 11975 7935 -33.7% 1.51x
CharIndexing_punctuatedJapanese_unicodeScalars 1855 1283 -30.8% 1.45x
CharIndexing_utf16_unicodeScalars_Backwards 10878 7815 -28.2% 1.39x
CharIndexing_utf16_unicodeScalars 9956 7400 -25.7% 1.35x
NSStringConversion 632 484 -23.4% 1.31x
DictionaryGroup 352 303 -13.9% 1.16x
StringBuilder 333 293 -12.0% 1.14x
StringBuilderSmallReservingCapacity 343 302 -12.0% 1.14x
DataSetCountSmall 141 125 -11.3% 1.13x
StringUTF16Builder 355 317 -10.7% 1.12x
StringAdder 418 379 -9.3% 1.10x
StrComplexWalk 5237 4848 -7.4% 1.08x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Regression
StringBuilder.o 7206 7566 +5.0% 0.95x
Improvement
StringMatch.o 4649 4529 -2.6% 1.03x
ReversedCollections.o 9746 9610 -1.4% 1.01x
InsertCharacter.o 4815 4751 -1.3% 1.01x
StringEdits.o 12014 11862 -1.3% 1.01x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Improvement
ArrayOfPOD 765 690 -9.8% 1.11x (?)

Code size: -swiftlibs

TEST OLD NEW DELTA RATIO
Regression
libswiftSwiftPrivate.dylib 40960 45056 +10.0% 0.91x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 8-Core Intel Xeon E5
  Processor Speed: 3 GHz
  Number of Processors: 1
  Total Number of Cores: 8
  L2 Cache (per Core): 256 KB
  L3 Cache: 25 MB
  Memory: 64 GB
--------------

internal init(leading: UInt64, trailing: UInt64, count: Int) {
_internalInvariant(count <= _SmallString.capacity)

let isASCII = (leading | trailing) & 0x8080_8080_8080_8080 == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh, nice. (I know it's not new; I just hadn't seen it before.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, given how fast this is, I wonder if the dedicated isASCII bit is even needed, or if it's worth saving for something.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not that fast; we have to mask off the discriminator too. While ARM can encode the constant as an immediate, but it's still more code. Specifically for small strings, ASCII-ness is by far the most common and important characteristic to know.

The other contender for this bit would be remembering both isASCII and doesn't have CR+LF, i.e. single-scalar-grapheme. But, I think it's better to check that on Character access.

  public func _forCodeGen_1() { if isASCII { foo() } else { bar() } }
  public func _forCodeGen_2() { if computeIsASCII() { foo() } else { bar() } }
_$ss12_SmallStringV13_forCodeGen_1yyF:
  tbnz       x1, 0x3e, loc_15c474

  b          _$ss12_SmallStringV3baryyFTf4d_n

loc_15c474:
  b          _$ss12_SmallStringV3fooyyFTf4d_n



_$ss12_SmallStringV13_forCodeGen_2yyF:
  orr        x8, xzr, #0x8080808080808080
  movk       x8, #0x80, lsl #48
  and        x8, x1, x8
  and        x9, x0, #0x8080808080808080
  orr        x8, x8, x9
  cbz        x8, loc_15c494

  b          _$ss12_SmallStringV3baryyFTf4d_n
loc_15c494:
  b          _$ss12_SmallStringV3fooyyFTf4d_n

Considering that this check should probably be at the top of every String API inlined into user code, it's probably worth it.

Copy link
Contributor

@jrose-apple jrose-apple Dec 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory that could be

orr x8, x1, x0, lsl #8
and x8, x8, #0x8080808080808080
cbz x8, loc

but I take your point.

@@ -64,17 +60,22 @@ internal struct _StringObject {
as above, except bit 7 is omitted from storage -- it is left free, to supply
extra inhabitants in the StringObject structure. The missing bit can be
recovered by looking at `_variant.isImmortal`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part of the comment is no longer up to date, since there are not 7 bits.

NOTE: isNativelyStored is *specifically* allocated to b61 to align with the
bit-position of isSmall, which allows us to check for native storage without
an extra branch checking small-ness. Otherwise, these bits are invalid to read
from small strings.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand this bit. Aren't they in different words?

Copy link
Member Author

@milseman milseman Dec 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Flags are invalid to check on small strings, since small strings are using the bits for their stored code units. Before this change, we were able to tell if we had a natively managed string with a simple comparison of the top byte. But, moving this bit to the flags means we have to check whether we are small before checking the flag bit.

I.e. !isSmall && _countAndFlags.isNativelyStored. This is another way of saying something like !bridgeObject[bit: 61] && _countAndFlags[bit: 61]. Since both words are checking the same bit, we can do: (~bridgeObject & _countAndFlags)[bit: 61].

So, something like if _object.hasNativeStorage compiles to a simple orn and tbnz.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha! Clever.

- `largeAddressBits` holds an instance of `_StringStorage`.
isTailAllocated: start of the code units is at the stored address + `nativeBias`
- `isNativelyStored` always implies `isTailAllocated`, but not vice versa
(e.g. literals)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not have a canonical form for "I point directly to unbiased string data" yet?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have anything that directly (unbiased) points to contents.

@milseman
Copy link
Member Author

Timeouts and package failures. Going to try to repro locally, but in the mean time kick off another run.

@milseman
Copy link
Member Author

@swift-ci please test

@milseman
Copy link
Member Author

@swift-ci please benchmark

@swift-ci
Copy link
Contributor

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Regression
CharIndexing_russian_unicodeScalars 5378 6519 +21.2% 0.82x
CharIndexing_chinese_unicodeScalars 4577 5531 +20.8% 0.83x
CharIteration_tweet_unicodeScalars_Backwards 9007 10843 +20.4% 0.83x
CharIteration_ascii_unicodeScalars_Backwards 4537 5454 +20.2% 0.83x
CharIndexing_japanese_unicodeScalars 7563 9033 +19.4% 0.84x
CharIndexing_ascii_unicodeScalars 4858 5784 +19.1% 0.84x
CharIndexing_korean_unicodeScalars 6158 7298 +18.5% 0.84x
CharIteration_punctuated_unicodeScalars_Backwards 1094 1285 +17.5% 0.85x
CharIteration_russian_unicodeScalars 3015 3535 +17.2% 0.85x
CharIndexing_punctuatedJapanese_unicodeScalars 1264 1471 +16.4% 0.86x
CharIndexing_punctuated_unicodeScalars 1262 1460 +15.7% 0.86x
CharIndexing_tweet_unicodeScalars_Backwards 9516 10961 +15.2% 0.87x
CharIndexing_tweet_unicodeScalars 9825 11306 +15.1% 0.87x
CharIteration_chinese_unicodeScalars_Backwards 4581 5271 +15.1% 0.87x
CharIndexing_ascii_unicodeScalars_Backwards 4857 5582 +14.9% 0.87x
CharIndexing_utf16_unicodeScalars 7230 8218 +13.7% 0.88x
CharIteration_russian_unicodeScalars_Backwards 5112 5786 +13.2% 0.88x
CharIteration_korean_unicodeScalars_Backwards 6224 6988 +12.3% 0.89x
CharIteration_japanese_unicodeScalars_Backwards 7817 8674 +11.0% 0.90x
CharIndexing_punctuated_unicodeScalars_Backwards 1262 1394 +10.5% 0.91x
CharIndexing_russian_unicodeScalars_Backwards 5284 5814 +10.0% 0.91x
MapReduceLazyCollectionShort 31 34 +9.7% 0.91x (?)
CharIteration_utf16_unicodeScalars_Backwards 7781 8503 +9.3% 0.92x
CharIndexing_chinese_unicodeScalars_Backwards 4757 5195 +9.2% 0.92x
CharIteration_chinese_unicodeScalars 2786 3031 +8.8% 0.92x
Improvement
DataSetCountSmall 154 137 -11.0% 1.12x
EqualSubstringSubstringGenericEquatable 43 40 -7.0% 1.07x

Code size: -O

TEST OLD NEW DELTA RATIO
Regression
StringBuilder.o 7316 7700 +5.2% 0.95x
ArrayAppend.o 38744 39176 +1.1% 0.99x
Improvement
InsertCharacter.o 5176 5096 -1.5% 1.02x
StringMatch.o 4606 4541 -1.4% 1.01x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Regression
CharIteration_tweet_unicodeScalars_Backwards 9722 11190 +15.1% 0.87x
CharIteration_ascii_unicodeScalars_Backwards 4910 5646 +15.0% 0.87x
CharIteration_punctuated_unicodeScalars_Backwards 1136 1288 +13.4% 0.88x
CharIteration_ascii_unicodeScalars 2859 3222 +12.7% 0.89x
CharIteration_russian_unicodeScalars 3180 3557 +11.9% 0.89x
CharIteration_tweet_unicodeScalars 4680 5184 +10.8% 0.90x
CharIteration_utf16_unicodeScalars 3840 4194 +9.2% 0.92x
Breadcrumbs.UTF16ToIdxRange.longMixed 110 120 +9.1% 0.92x
CharIteration_punctuatedJapanese_unicodeScalars 697 757 +8.6% 0.92x
CharIteration_punctuated_unicodeScalars 685 742 +8.3% 0.92x
CharIteration_korean_unicodeScalars_Backwards 6357 6884 +8.3% 0.92x
Improvement
CharIndexing_tweet_unicodeScalars_Backwards 19783 10967 -44.6% 1.80x
CharIndexing_ascii_unicodeScalars_Backwards 10055 5587 -44.4% 1.80x
CharIndexing_punctuated_unicodeScalars_Backwards 2455 1411 -42.5% 1.74x
CharIndexing_korean_unicodeScalars_Backwards 12036 7063 -41.3% 1.70x
CharIndexing_chinese_unicodeScalars_Backwards 8908 5309 -40.4% 1.68x
CharIndexing_japanese_unicodeScalars_Backwards 14135 8640 -38.9% 1.64x
CharIndexing_russian_unicodeScalars_Backwards 9831 6026 -38.7% 1.63x
CharIndexing_russian_unicodeScalars 9931 6122 -38.4% 1.62x
CharIndexing_tweet_unicodeScalars 19433 12026 -38.1% 1.62x
CharIndexing_korean_unicodeScalars 11549 7164 -38.0% 1.61x
CharIndexing_ascii_unicodeScalars 9864 6142 -37.7% 1.61x
CharIndexing_punctuated_unicodeScalars 2381 1494 -37.3% 1.59x
CharIndexing_punctuatedJapanese_unicodeScalars_Backwards 2195 1412 -35.7% 1.55x
CharIndexing_chinese_unicodeScalars 8495 5554 -34.6% 1.53x
CharIndexing_japanese_unicodeScalars 13329 8841 -33.7% 1.51x
CharIndexing_punctuatedJapanese_unicodeScalars 2063 1425 -30.9% 1.45x
CharIndexing_utf16_unicodeScalars_Backwards 12112 8723 -28.0% 1.39x
CharIndexing_utf16_unicodeScalars 11272 8251 -26.8% 1.37x
StrComplexWalk 6130 5531 -9.8% 1.11x
LessSubstringSubstring 44 40 -9.1% 1.10x
EqualSubstringSubstring 43 40 -7.0% 1.07x
EqualSubstringSubstringGenericEquatable 44 41 -6.8% 1.07x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Regression
StringBuilder.o 7206 7566 +5.0% 0.95x
Improvement
StringMatch.o 4649 4529 -2.6% 1.03x
ReversedCollections.o 9746 9610 -1.4% 1.01x
InsertCharacter.o 4815 4751 -1.3% 1.01x
StringEdits.o 12014 11862 -1.3% 1.01x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Regression
ArrayOfPOD 780 859 +10.1% 0.91x (?)
Improvement
EqualSubstringSubstringGenericEquatable 50 46 -8.0% 1.09x
LessSubstringSubstringGenericComparable 50 46 -8.0% 1.09x
DataSetCountSmall 217 200 -7.8% 1.08x

Code size: -swiftlibs

TEST OLD NEW DELTA RATIO
Regression
libswiftSwiftPrivate.dylib 40960 45056 +10.0% 0.91x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB
--------------

@milseman
Copy link
Member Author

@swift-ci please test

@milseman
Copy link
Member Author

@swift-ci please benchmark

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - f0aa7214085c43cfb0ffd3aa9ba20676046a9b02

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - f0aa7214085c43cfb0ffd3aa9ba20676046a9b02

@milseman
Copy link
Member Author

@swift-ci please test

@milseman
Copy link
Member Author

@swift-ci please benchmark

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - a77291aefce36f831ba445afa2fabd7a5eaef7ce

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - a77291aefce36f831ba445afa2fabd7a5eaef7ce

@swift-ci
Copy link
Contributor

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Regression
CharIndexing_russian_unicodeScalars 5360 6520 +21.6% 0.82x
CharIndexing_chinese_unicodeScalars 4560 5520 +21.1% 0.83x
CharIteration_ascii_unicodeScalars_Backwards 4520 5440 +20.4% 0.83x
CharIteration_tweet_unicodeScalars_Backwards 9000 10800 +20.0% 0.83x
CharIndexing_korean_unicodeScalars 6120 7320 +19.6% 0.84x
Breadcrumbs.CopyUTF16CodeUnits.Mixed 57 68 +19.3% 0.84x
CharIndexing_japanese_unicodeScalars 7560 9000 +19.0% 0.84x
CharIndexing_ascii_unicodeScalars 4840 5760 +19.0% 0.84x
CharIteration_punctuated_unicodeScalars_Backwards 1080 1280 +18.5% 0.84x
CharIndexing_punctuated_unicodeScalars 1240 1440 +16.1% 0.86x
CharIndexing_punctuatedJapanese_unicodeScalars 1240 1440 +16.1% 0.86x
CharIteration_russian_unicodeScalars_Backwards 5080 5880 +15.7% 0.86x
CharIndexing_tweet_unicodeScalars_Backwards 9480 10960 +15.6% 0.86x
CharIndexing_tweet_unicodeScalars 9800 11280 +15.1% 0.87x
CharIteration_chinese_unicodeScalars_Backwards 4560 5240 +14.9% 0.87x
CharIndexing_ascii_unicodeScalars_Backwards 4840 5560 +14.9% 0.87x
CharIndexing_utf16_unicodeScalars 7200 8240 +14.4% 0.87x
CharIteration_punctuated_unicodeScalars 640 720 +12.5% 0.89x
CharIteration_korean_unicodeScalars_Backwards 6240 6960 +11.5% 0.90x
CharIteration_japanese_unicodeScalars_Backwards 7800 8640 +10.8% 0.90x
CharIteration_punctuatedJapanese_unicodeScalars_Backwards 1160 1280 +10.3% 0.91x
CharIndexing_punctuated_unicodeScalars_Backwards 1240 1360 +9.7% 0.91x
MapReduceLazyCollectionShort 31 34 +9.7% 0.91x (?)
CharIteration_utf16_unicodeScalars_Backwards 7760 8480 +9.3% 0.92x
CharIteration_chinese_unicodeScalars 2760 3000 +8.7% 0.92x
CharIndexing_chinese_unicodeScalars_Backwards 4760 5160 +8.4% 0.92x
StringToDataEmpty 650 700 +7.7% 0.93x
Improvement
DataSetCountSmall 154 137 -11.0% 1.12x
EqualStringSubstring 43 40 -7.0% 1.07x
EqualSubstringSubstringGenericEquatable 43 40 -7.0% 1.07x

Code size: -O

TEST OLD NEW DELTA RATIO
Regression
StringBuilder.o 7476 7860 +5.1% 0.95x
ArrayAppend.o 38744 39176 +1.1% 0.99x
Improvement
StringMatch.o 4606 4525 -1.8% 1.02x
InsertCharacter.o 5176 5096 -1.5% 1.02x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Regression
CharIteration_ascii_unicodeScalars_Backwards 4880 5640 +15.6% 0.87x
CharIteration_tweet_unicodeScalars_Backwards 9680 11160 +15.3% 0.87x
CharIteration_punctuated_unicodeScalars_Backwards 1120 1280 +14.3% 0.88x
CharIteration_ascii_unicodeScalars 2840 3200 +12.7% 0.89x
CharIteration_russian_unicodeScalars 3160 3560 +12.7% 0.89x
CharIteration_tweet_unicodeScalars 4640 5120 +10.3% 0.91x
CharIteration_utf16_unicodeScalars 3840 4160 +8.3% 0.92x
Improvement
CharIndexing_ascii_unicodeScalars_Backwards 10040 5560 -44.6% 1.81x
CharIndexing_tweet_unicodeScalars_Backwards 19760 10960 -44.5% 1.80x
CharIndexing_punctuated_unicodeScalars_Backwards 2440 1400 -42.6% 1.74x
CharIndexing_korean_unicodeScalars_Backwards 12120 7040 -41.9% 1.72x
CharIndexing_russian_unicodeScalars_Backwards 9840 5800 -41.1% 1.70x
CharIndexing_chinese_unicodeScalars_Backwards 8920 5280 -40.8% 1.69x
CharIndexing_russian_unicodeScalars 9920 5920 -40.3% 1.68x
CharIndexing_japanese_unicodeScalars_Backwards 14120 8640 -38.8% 1.63x
CharIndexing_tweet_unicodeScalars 19400 12000 -38.1% 1.62x
CharIndexing_korean_unicodeScalars 11520 7160 -37.8% 1.61x
CharIndexing_ascii_unicodeScalars 9840 6120 -37.8% 1.61x
CharIndexing_punctuated_unicodeScalars 2360 1480 -37.3% 1.59x
CharIndexing_chinese_unicodeScalars 8480 5440 -35.8% 1.56x
CharIndexing_punctuatedJapanese_unicodeScalars_Backwards 2160 1400 -35.2% 1.54x
CharIndexing_japanese_unicodeScalars 13280 8800 -33.7% 1.51x
CharIndexing_punctuatedJapanese_unicodeScalars 2040 1400 -31.4% 1.46x
CharIndexing_utf16_unicodeScalars_Backwards 12080 8680 -28.1% 1.39x
CharIndexing_utf16_unicodeScalars 11320 8280 -26.9% 1.37x
DataSetCountSmall 157 140 -10.8% 1.12x
StringBuilderSmallReservingCapacity 395 358 -9.4% 1.10x
StrComplexWalk 6100 5530 -9.3% 1.10x
StringBuilder 385 350 -9.1% 1.10x
LessSubstringSubstring 44 40 -9.1% 1.10x
EqualSubstringSubstringGenericEquatable 44 40 -9.1% 1.10x
EqualSubstringSubstring 43 40 -7.0% 1.07x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Regression
StringBuilder.o 7398 7758 +4.9% 0.95x
Improvement
StringMatch.o 4585 4513 -1.6% 1.02x
ReversedCollections.o 9746 9610 -1.4% 1.01x
InsertCharacter.o 4815 4751 -1.3% 1.01x
StringEdits.o 11982 11830 -1.3% 1.01x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Regression
CharIteration_ascii_unicodeScalars 262160 357400 +36.3% 0.73x
ArrayOfPOD 781 858 +9.9% 0.91x (?)
Improvement
EqualSubstringSubstringGenericEquatable 50 46 -8.0% 1.09x
DataSetCountSmall 217 200 -7.8% 1.08x

Code size: -swiftlibs

TEST OLD NEW DELTA RATIO
Regression
libswiftSwiftPrivate.dylib 40960 45056 +10.0% 0.91x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB
--------------

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - f0aa7214085c43cfb0ffd3aa9ba20676046a9b02

@milseman
Copy link
Member Author

So yeah, need to update LLDB

Remove Discriminator, Flags, etc., abstractions from
StringObject. These cause code divergence between 32-bit and 64-bit
ABI, complicate ABI changes, and otherwise contribute to bloat.
In anticipation of potential future HW features, e.g. armv8.5 memory
tagging, only use the high 4 bytes as discriminator bits in
_BridgeObject rather than the top 8 bits. Utilize two perf flags to
cover this instead. This requires shifting around a fair amount of
internal complexity.
@milseman
Copy link
Member Author

Please test with following pull request:
apple/swift-lldb#1168

@swift-ci please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 26df40cbfea4692f708cca7c02b458a1f84bf905

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 26df40cbfea4692f708cca7c02b458a1f84bf905

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants