This repository was archived by the owner on Jul 1, 2023. It is now read-only.

Compiling on release mode #1189

Closed
@philipturner

Due to the errors with linking S4TF to an arbitrary Swift executable (#1185 (comment)), I am currently very constrained in how I can test code that imports S4TF. For now, my only option is to replace the Swift package tests with the custom code I want to execute. Having to rebuild S4TF repeatedly is a bottleneck in my workflow.

I profiled S4TF build times on Google Colab (dual-core x64) and found some interesting results. Running swift test always recompiles your code, even if you already compiled it with swift build. There is only one exception: when both swift build and swift test run in debug mode, the redundant recompilation is avoided. That speedup does not apply when both run in release mode with -Onone, which is otherwise the configuration that compiles most quickly.
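For reference, the build configurations compared here can be written out as SwiftPM invocations. The helper below is only a sketch: `swiftpm_command` is a hypothetical name, and it assumes that "release (-Onone)" means passing `-Xswiftc -Onone` on top of `-c release`, which may not be the exact command line used for the measurements. It only constructs the argument lists and does not run anything.

```python
def swiftpm_command(action, configuration="debug", unoptimized=False):
    """Return the argument list for a `swift build` or `swift test` call.

    Assumption: "-Onone release" is expressed as `-c release -Xswiftc -Onone`.
    """
    if action not in ("build", "test"):
        raise ValueError(f"unsupported action: {action}")
    if configuration not in ("debug", "release"):
        raise ValueError(f"unsupported configuration: {configuration}")

    command = ["swift", action, "-c", configuration]
    if unoptimized:
        # Forward -Onone to the Swift compiler to disable optimization.
        command += ["-Xswiftc", "-Onone"]
    return command


if __name__ == "__main__":
    # The four invocations that make up the comparison.
    for action in ("build", "test"):
        print(" ".join(swiftpm_command(action, "debug")))
        print(" ".join(swiftpm_command(action, "release", unoptimized=True)))
```

Passing one of these lists to `subprocess.run` inside the package directory would perform the corresponding build.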

  • Pre-build as release (-Onone) (excluding tests): 1 min 51 sec
    • Build tests as release (-Onone): 2 min 29 sec (everything)
      • Extrapolated time if excluding tests: 1 min 50 sec
    • Build tests as debug: 3 min 50 sec
      • Extrapolated time if excluding tests: 3 min 0 sec
  • Pre-build as debug (excluding tests): 3 min 0 sec
    • Build tests as release (-Onone): 2 min 48 sec (everything)
      • Extrapolated time if excluding tests: 2 min 7 sec
    • Build tests as debug: 57 sec
      • Extrapolated time if excluding tests: 0 sec
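To make the comparison easier to read, the short script below adds each pre-build time to the matching test-build time, giving the end-to-end cost of every combination. The numbers are copied from the list above. The helper name `to_seconds` and the dictionary layout are just illustrative choices.

```python
def to_seconds(text):
    """Convert strings such as '1 min 51 sec' or '57 sec' to seconds."""
    parts = text.split()
    total = 0
    # Walk over (value, unit) pairs: ('1', 'min'), ('51', 'sec'), ...
    for value, unit in zip(parts[0::2], parts[1::2]):
        total += int(value) * (60 if unit == "min" else 1)
    return total


# (pre-build mode, test-build mode) -> (pre-build time, test-build time)
measurements = {
    ("release", "release"): ("1 min 51 sec", "2 min 29 sec"),
    ("release", "debug"): ("1 min 51 sec", "3 min 50 sec"),
    ("debug", "release"): ("3 min 0 sec", "2 min 48 sec"),
    ("debug", "debug"): ("3 min 0 sec", "57 sec"),
}

# Combined wall-clock time of the pre-build followed by the test build.
combined_seconds = {
    modes: to_seconds(prebuild) + to_seconds(tests)
    for modes, (prebuild, tests) in measurements.items()
}

fastest = min(combined_seconds, key=combined_seconds.get)

if __name__ == "__main__":
    for modes, seconds in combined_seconds.items():
        print(modes, f"{seconds // 60} min {seconds % 60} sec")
    print("fastest combination:", fastest)
```

With these figures, debug followed by debug has the lowest combined time (3 min 57 sec), because it is the only pair that reuses the pre-built products, while a single -Onone release build remains the fastest individual step.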

If I can find a way to import S4TF outside of its tests, compiling with unoptimized release seems to be the wisest option. That would take around 2 minutes. I could add a special command to Swift-Colab that caches the Swift package build products folder. When you restart the runtime (I do that often), it would link against the build products instead of re-compiling. It would also cache the x10 binaries so you only download them from the network once. This Colab command would be implemented once there is a Swift toolchain that both runs S4TF and has the Python LLDB API.
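A rough sketch of the caching idea, written in Python since it would run on the Colab side. It is not the Swift-Colab implementation: the function names are hypothetical, the cache location is left to the caller, and the only thing taken for granted is that SwiftPM writes build products to a `.build` folder inside the package directory. Whether SwiftPM would actually reuse a restored folder without recompiling is untested here.

```python
import shutil
from pathlib import Path


def save_build_products(package_dir, cache_dir):
    """Copy <package_dir>/.build into the cache, replacing any older copy."""
    package_dir, cache_dir = Path(package_dir), Path(cache_dir)
    source = package_dir / ".build"
    target = cache_dir / package_dir.name
    if target.exists():
        shutil.rmtree(target)
    # copytree preserves modification times, which build systems often check.
    shutil.copytree(source, target, symlinks=True)
    return target


def restore_build_products(package_dir, cache_dir):
    """Restore a cached .build folder. Returns False if nothing is cached."""
    package_dir, cache_dir = Path(package_dir), Path(cache_dir)
    cached = cache_dir / package_dir.name
    if not cached.is_dir():
        return False
    target = package_dir / ".build"
    if target.exists():
        shutil.rmtree(target)
    shutil.copytree(cached, target, symlinks=True)
    return True
```

The intended usage would be to call `save_build_products` after a successful build and `restore_build_products` after a runtime restart, falling back to a full build when it returns `False`. The same pattern could be applied to the downloaded x10 binaries.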

I previously heard that there were some performance concerns with not compiling S4TF with full optimization. There are tight loops where using debug mode could cause a bottleneck, but where do these loops happen? If they are in CTensorFlow, then it doesn't matter how S4TF is compiled because CTensorFlow is pre-compiled in the x10 binary.

When I tried compiling S4TF in fully optimized release mode, I got the compiler crash caused by BatchNorm, which is currently unsolved. The crash logs are in the Colab notebooks attached below. This crash did not happen in release when the -Onone flag was set - does that behavior reveal anything new about the bug?
crash_no_tests.ipynb.zip
crash_with_tests.ipynb.zip

I am compiling with the 2021-11-12 toolchain instead of the newest one (2022-01-06). Newer toolchains (the regression first appears in the 2021-12-23 snapshot or an earlier one) introduce a bug that prevents S4TF from compiling even in debug mode (#1184 (comment)).
