Program is several times slower when compiled with swiftc than when compiled with Xcode due to retain&release calls #70745

Open
@yakubin

Description

The same program is several times slower when compiled with swiftc on the command line than when compiled with Xcode. The swiftc build also makes many more calls to swift_retain than the Xcode build.

Reproduction

I’ve recently made a Swift program several times faster by changing one class into a struct. Here is the class version.

Judging from the CPU profile gathered by Instruments.app, the slowdown prior to the change was mostly caused by retain&release calls:

[Screenshot, 2024-01-04: Instruments CPU profile of the class version]

Those calls all but disappeared from the profile after changing the Sudoku class into a struct. The timings:

class:

$ time ./sudoku >/dev/null
./sudoku > /dev/null  9.72s user 0.03s system 99% cpu 9.771 total

struct:

$ time ./sudoku >/dev/null
./sudoku > /dev/null  1.81s user 0.00s system 99% cpu 1.815 total
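
For readers without the linked source at hand, the kind of change involved looks roughly like the sketch below. The names BoardClass and BoardStruct and their members are hypothetical stand-ins, not the actual Sudoku code. The point is only that a class instance is a heap object whose references are retained and released under ARC, while a struct is a value that is copied. An array stored inside a struct is still backed by a reference-counted buffer, so a struct is not free of retains altogether.

```swift
// Hypothetical stand-ins for the two versions; not the actual Sudoku source.
class BoardClass {
    var cells: [UInt8]
    init(cells: [UInt8]) { self.cells = cells }
}

struct BoardStruct {
    var cells: [UInt8]
}

// Reference semantics: `a` and `b` name the same heap object,
// so a write through `b` is visible through `a`.
let a = BoardClass(cells: [1, 2, 3])
let b = a
b.cells[0] = 9
print(a.cells[0]) // prints 9

// Value semantics: `d` is an independent copy of `c`.
let c = BoardStruct(cells: [1, 2, 3])
var d = c
d.cells[0] = 9
print(c.cells[0]) // prints 1
```

Whether a class or a struct is the better fit depends on whether shared mutable state is actually needed. The sketch says nothing about how large the performance difference is in any given program.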

The command I used to compile the program:

swiftc -Ounchecked sudoku.swift

What’s surprising is that if I create a new CLI app project in Xcode, paste the same source code (the class version) into main.swift and build the project in Release mode, the timing becomes:

$ time ./Sudoku >/dev/null
./Sudoku > /dev/null  2.86s user 0.01s system 99% cpu 2.874 total

That’s a lot faster than the version compiled with the swiftc command. If I compile the struct version with -O instead of -Ounchecked (because Xcode in Release mode uses -O), the struct version compiled with swiftc turns out to be not much faster than the class version compiled with Xcode:

$ time ./sudoku >/dev/null
./sudoku > /dev/null  2.47s user 0.01s system 99% cpu 2.481 total

The difference in the number of retain&release calls can also be observed by running DTrace against the struct and class versions:

First save the class version of this code in sudoku-class.swift, the struct version in sudoku-struct.swift and the final class version in sudoku-final-class.swift. Then:

sudo -v # just to refresh the sudo password before the next command
for ver in sudoku-struct sudoku-class sudoku-final-class; do
  echo -e "\n=========\nVERSION: $ver\n=========\n"
  swiftc -Ounchecked "$ver.swift"
  time sudo dtrace -c "./$ver" -n 'pid$target:libswiftCore.dylib:swift_retain:entry { @[probefunc] = count(); } profile:::tick-60s { printf("\nTIMEOUT\n"); exit(0); }'
done

Results:


=========
VERSION: sudoku-struct
=========

dtrace: system integrity protection is on, some features will not be available

dtrace: description 'pid$target:libswiftCore.dylib:swift_retain:entry ' matched 2 probes
done
4000
dtrace: pid 67148 has exited

  swift_retain                                                 292003
sudo dtrace -c "./$ver" -n   1.96s user 0.79s system 97% cpu 2.829 total

=========
VERSION: sudoku-class
=========

dtrace: system integrity protection is on, some features will not be available

dtrace: description 'pid$target:libswiftCore.dylib:swift_retain:entry ' matched 2 probes
CPU     ID                    FUNCTION:NAME
  4   6248                        :tick-60s 
TIMEOUT


  swift_retain                                               21329236
sudo dtrace -c "./$ver" -n   5.58s user 54.57s system 99% cpu 1:00.22 total

=========
VERSION: sudoku-final-class
=========

dtrace: system integrity protection is on, some features will not be available

dtrace: description 'pid$target:libswiftCore.dylib:swift_retain:entry ' matched 2 probes
done
4000
dtrace: pid 67162 has exited

  swift_retain                                                 288003
sudo dtrace -c "./$ver" -n   2.30s user 0.78s system 90% cpu 3.404 total

The struct version made 292003 calls to swift_retain and finished in under 3 seconds. The final class version made 288003 calls and finished in about 3.4 seconds. The non-final class version had already made 21329236 calls to swift_retain when the DTrace script hit its one-minute timeout and stopped the run. Without the timeout, this version is happy to run for 20 minutes or more while swift_retain is instrumented by DTrace, which is to be expected when the instrumented function is called that many times.
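
As a reminder of what the `final` annotation looks like, here is a minimal sketch. Solver and FinalSolver are hypothetical names, not taken from the Sudoku source, and the sketch assumes the final class version differs from the class version only by this keyword. A `final` class cannot be subclassed, so its methods cannot be overridden and the compiler is free to call them directly. Why that also removes so many retain calls in this program is the open question of this report, and the sketch does not answer it.

```swift
// Hypothetical example; the names are illustrative only.
class Solver {
    var steps = 0
    // A subclass may override this method.
    func advance() { steps += 1 }
}

final class FinalSolver {
    var steps = 0
    // `final` on the class rules out subclasses, so no override can exist.
    func advance() { steps += 1 }
}

let plain = Solver()
let sealed = FinalSolver()
for _ in 0..<3 {
    plain.advance()
    sealed.advance()
}
print(plain.steps, sealed.steps) // prints 3 3
```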

Now let’s look at the class version compiled with Xcode:

time sudo dtrace -c ./Sudoku -n 'pid$target:libswiftCore.dylib:swift_retain:entry { @[probefunc] = count(); } profile:::tick-60s { printf("\nTIMEOUT\n"); exit(0); }'
dtrace: system integrity protection is on, some features will not be available

dtrace: description 'pid$target:libswiftCore.dylib:swift_retain:entry ' matched 2 probes
done
4000
dtrace: pid 67175 has exited

  swift_retain                                                 288003
sudo dtrace -c ./Sudoku -n   3.00s user 0.79s system 99% cpu 3.801 total

288003 calls to swift_retain and a successful exit after 3.8 seconds. Hmm... Those numbers seem awfully similar to the final class version compiled with swiftc. The number of calls to swift_retain matches exactly.

So I suspect that the Swift compiler invoked by Xcode noticed that the class is effectively final even without the help of an annotation and somehow used that to optimise away a load of retain&release calls, while bare swiftc failed to do so.
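
One documented way the compiler can treat a class as final without an annotation is through access control: a `private` or `fileprivate` class can only be subclassed in the same file, so the compiler can see every possible subclass. With whole-module optimization (swiftc's `-whole-module-optimization` flag), the same reasoning extends to `internal` declarations. Whether this explains the difference reported here is an assumption, not something this report has verified. The sketch below uses a hypothetical Grid class purely to show the access-control form.

```swift
// Hypothetical example. `fileprivate` restricts the class to this file,
// so any subclass would have to be declared here as well. None is,
// which lets the compiler treat Grid as if it were final.
fileprivate class Grid {
    var filled = 0
    func place() { filled += 1 }
}

fileprivate func fill(_ grid: Grid, times: Int) {
    for _ in 0..<times { grid.place() }
}

fileprivate let grid = Grid()
fill(grid, times: 4)
print(grid.filled) // prints 4
```

If this is the relevant mechanism, marking the Sudoku class `fileprivate`, or building with whole-module optimization, would be expected to behave like the explicit `final` version. That is untested here.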

I’ve tried looking at the Xcode build log to check the flags passed to the compiler, but Xcode uses swift-frontend instead of swiftc and many of the flags there are invalid for swiftc. The ones that I tried and that swiftc accepted didn’t make a difference.

Expected behavior

The class version of this program should take a similar time to execute and make a similar number of calls to swift_retain whether it is compiled by Xcode or by swiftc. Instead, the binary built by swiftc is several times slower and makes far more calls to swift_retain than the one built by Xcode.

Environment

$ swiftc --version
swift-driver version: 1.87.3 Apple Swift version 5.9.2 (swiftlang-5.9.2.2.56 clang-1500.1.0.2.5)
Target: arm64-apple-macosx13.0

Xcode version: 15.1

Additional information

No response

    Labels

    ARC: Feature: automatic reference counting
    SILOptimizer: Area → compiler: SIL optimization passes
    bug: A deviation from expected or documented behavior. Also: expected but undesirable behavior.
    class: Feature → type declarations: Class declarations
    compiler: The Swift compiler itself
    declarations: Feature: declarations
    performance
    run-time performance
    type declarations: Feature → declarations: Type declarations
    unexpected behavior: Bug: Unexpected behavior or incorrect output
