Remove environment name from build_cache_dir cache keys #4574

Open
1 task done
LordMike opened this issue Mar 17, 2023 · 41 comments

Comments

@LordMike

What kind of issue is this?

  • Feature Request.
    Start by telling us what problem you’re trying to solve. Often a solution
    already exists! Don’t send pull requests to implement new features without first getting our
    support. Sometimes we leave features out on purpose to keep the project small.

Configuration

Operating system: docker/linux, host is Debian 10, image is ghcr.io/esphome/esphome-hassio:2023.3.0
PlatformIO Version (platformio --version): PlatformIO Core, version 6.1.6

Description of problem

I'm using esphome to manage tens of mostly identical devices with identical sets of source code sans a few items such as their individual name and some encryption keys. When I use esphome to compile the firmwares for these devices, they create a separate environment in platformio for each device, which is good for isolation. It does take a long time to compile the mostly identical source code though, for all these devices, when updates are available. I'd like to alleviate that.

In another issue, I've been digging into what can be done, and I've experimented with the build_cache_dir setting, which esphome does not currently use. This setting works wonders for a single device: if I recompile after cleaning the output, the build is near-instantaneous, as it uses the cache.

The cached output from one device is not used in another, and I've found this topic (https://community.platformio.org/t/build-cache-dir-will-not-share-object-files-between-envs/10011/8) on your community forums that provides a crucial hint. It seems the env:name is part of the cache key, and as each device in esphome is a separate environment, I have my issue.

I hope this issue can lead to the env:name being removed from the cache key, to allow the build cache to be reused across environments. It's my impression that the rest of the build commandline includes all the details that are in environments, such as boards, platforms and libraries.

Steps to Reproduce

  1. Create two environments of identical board, platform, dependencies, code
  2. Set build_cache_dir to some directory
  3. Build the two environments
  4. Observe that the identical code is compiled twice

Actual Results

The cache is not reused between identical builds if they are in two different environments.

Expected Results

Environment name should not affect the cache key. It's my impression that the rest of the build commandline (which is part of the key) is enough to distinguish one output from another.

If problems with PlatformIO Build System:

The content of platformio.ini:
This file is modified by me, locally, to include the build_cache_dir. It is identical to other devices I have, except for the environment name.

; Auto generated code by esphome

[common]
lib_deps =
build_flags =
upload_flags =

; ========== AUTO GENERATED CODE BEGIN ===========
[platformio]
description = ESPHome 2023.3.0
build_cache_dir = /esphome_cache/
[env:light-extra01]
board = esp8285
board_build.flash_mode = dout
board_build.ldscript = eagle.flash.1m.ld
build_flags =
    -DESPHOME_LOG_LEVEL=ESPHOME_LOG_LEVEL_INFO
    -DNEW_OOM_ABORT
    -DPIO_FRAMEWORK_ARDUINO_LWIP2_HIGHER_BANDWIDTH_LOW_FLASH
    -DUSE_ARDUINO
    -DUSE_ESP8266
    -DUSE_ESP8266_FRAMEWORK_ARDUINO
    -DUSE_STORE_LOG_STR_IN_FLASH
    -Wno-nonnull-compare
    -Wno-sign-compare
    -Wno-unused-but-set-variable
    -Wno-unused-variable
    -fno-exceptions
extra_scripts =
    post:post_build.py
framework = arduino
lib_deps =
    ottowinter/ESPAsyncTCP-esphome@1.2.3
    esphome/ESPAsyncWebServer-esphome@2.1.0
    DNSServer
    ESP8266WiFi
    ESP8266mDNS
    esphome/noise-c@0.1.4
    bblanchon/ArduinoJson@6.18.5
    ${common.lib_deps}
lib_ldf_mode = off
platform = platformio/espressif8266 @ 3.2.0
platform_packages =
    platformio/framework-arduinoespressif8266 @ ~3.30002.0
; =========== AUTO GENERATED CODE END ============

Additional info

@ivankravets
Member

Are you sure that build environments (flags, etc) are the same for both projects?

Please try to run both projects with pio run -v and compare VERBOSE output. The flags MUST be the same.

@LordMike
Author

I’ll try to see if I can get a run output for you. Hopefully today.

I can confirm though, that if I remove my build output for a specific device, and run the build again in esphome for the same device, most (all iirc) of the output - even the final “firmware.elf” comes from the cache.

So my assumption is that all things are identical - except for the environment name.

@LordMike
Author

LordMike commented Mar 18, 2023

I've not been able to figure out how to run pio directly, or how to run esphome myself. I think there may be a working directory or something I'm not aware of, because what I get does not match what I see in the logs when I trigger esphome through their dashboard. I did alter esphome to enable verbosity, though, which gave me a lot more log output when building.

I've captured what I think is the relevant commandline from when I trigger a rebuild of two of my identical devices, and they're shown here:

## light-41-1
xtensa-lx106-elf-g++ -o /data/light-41-1/.pioenvs/light-41-1/src/main.cpp.o -c -fno-rtti -std=gnu++17 -fno-exceptions -Wno-nonnull-compare -Wno-sign-compare -Wno-unused-but-set-variable -Wno-unused-variable -fno-exceptions -Os -mlongcalls -mtext-section-literals -falign-functions=4 -U__STRICT_ANSI__ -D_GNU_SOURCE -ffunction-sections -fdata-sections -Wall -Werror=return-type -free -fipa-pta -DPLATFORMIO=60106 -DESP8266 -DARDUINO_ARCH_ESP8266 -DARDUINO_ESP8266_ESP01 -DESPHOME_LOG_LEVEL=ESPHOME_LOG_LEVEL_INFO -DNEW_OOM_ABORT -DPIO_FRAMEWORK_ARDUINO_LWIP2_HIGHER_BANDWIDTH_LOW_FLASH -DUSE_ARDUINO -DUSE_ESP8266 -DUSE_ESP8266_FRAMEWORK_ARDUINO -DUSE_STORE_LOG_STR_IN_FLASH -DF_CPU=80000000L -D__ets__ -DICACHE_FLASH -DARDUINO=10805 -DARDUINO_BOARD=\"PLATFORMIO_ESP8285\" -DFLASHMODE_DOUT -DLWIP_OPEN_SRC -DNONOSDK22x_190703=1 -DTCP_MSS=1460 -DLWIP_FEATURES=0 -DLWIP_IPV6=0 -DVTABLES_IN_FLASH -DMMU_IRAM_SIZE=0x8000 -DMMU_ICACHE_SIZE=0x8000 -Isrc -I/data/light-41-1/.piolibdeps/light-41-1/ArduinoJson/src -I/data/light-41-1/.piolibdeps/light-41-1/noise-c/include -I/data/light-41-1/.piolibdeps/light-41-1/noise-c/src -I/data/light-41-1/.piolibdeps/light-41-1/libsodium/libsodium/src/libsodium/include -I/data/light-41-1/.piolibdeps/light-41-1/libsodium/libsodium/src/libsodium -I/data/light-41-1/.piolibdeps/light-41-1/libsodium/libsodium/src/libsodium/include/sodium -I/data/light-41-1/.piolibdeps/light-41-1/libsodium/port_include -I/data/cache/platformio/packages/framework-arduinoespressif8266/libraries/ESP8266mDNS/src -I/data/cache/platformio/packages/framework-arduinoespressif8266/libraries/DNSServer/src -I/data/light-41-1/.piolibdeps/light-41-1/ESPAsyncWebServer-esphome/src -I/data/cache/platformio/packages/framework-arduinoespressif8266/libraries/ESP8266WiFi/src -I/data/cache/platformio/packages/framework-arduinoespressif8266/libraries/Hash/src -I/data/light-41-1/.piolibdeps/light-41-1/ESPAsyncTCP-esphome/src 
-I/data/cache/platformio/packages/framework-arduinoespressif8266/tools/sdk/include -I/data/cache/platformio/packages/framework-arduinoespressif8266/cores/esp8266 -I/data/cache/platformio/packages/toolchain-xtensa/include -I/data/cache/platformio/packages/framework-arduinoespressif8266/tools/sdk/lwip2/include -I/data/cache/platformio/packages/framework-arduinoespressif8266/variants/generic src/main.cpp

## light-41-2
xtensa-lx106-elf-g++ -o /data/light-41-2/.pioenvs/light-41-2/src/main.cpp.o -c -fno-rtti -std=gnu++17 -fno-exceptions -Wno-nonnull-compare -Wno-sign-compare -Wno-unused-but-set-variable -Wno-unused-variable -fno-exceptions -Os -mlongcalls -mtext-section-literals -falign-functions=4 -U__STRICT_ANSI__ -D_GNU_SOURCE -ffunction-sections -fdata-sections -Wall -Werror=return-type -free -fipa-pta -DPLATFORMIO=60106 -DESP8266 -DARDUINO_ARCH_ESP8266 -DARDUINO_ESP8266_ESP01 -DESPHOME_LOG_LEVEL=ESPHOME_LOG_LEVEL_INFO -DNEW_OOM_ABORT -DPIO_FRAMEWORK_ARDUINO_LWIP2_HIGHER_BANDWIDTH_LOW_FLASH -DUSE_ARDUINO -DUSE_ESP8266 -DUSE_ESP8266_FRAMEWORK_ARDUINO -DUSE_STORE_LOG_STR_IN_FLASH -DF_CPU=80000000L -D__ets__ -DICACHE_FLASH -DARDUINO=10805 -DARDUINO_BOARD=\"PLATFORMIO_ESP8285\" -DFLASHMODE_DOUT -DLWIP_OPEN_SRC -DNONOSDK22x_190703=1 -DTCP_MSS=1460 -DLWIP_FEATURES=0 -DLWIP_IPV6=0 -DVTABLES_IN_FLASH -DMMU_IRAM_SIZE=0x8000 -DMMU_ICACHE_SIZE=0x8000 -Isrc -I/data/light-41-2/.piolibdeps/light-41-2/ArduinoJson/src -I/data/light-41-2/.piolibdeps/light-41-2/noise-c/include -I/data/light-41-2/.piolibdeps/light-41-2/noise-c/src -I/data/light-41-2/.piolibdeps/light-41-2/libsodium/libsodium/src/libsodium/include -I/data/light-41-2/.piolibdeps/light-41-2/libsodium/libsodium/src/libsodium -I/data/light-41-2/.piolibdeps/light-41-2/libsodium/libsodium/src/libsodium/include/sodium -I/data/light-41-2/.piolibdeps/light-41-2/libsodium/port_include -I/data/cache/platformio/packages/framework-arduinoespressif8266/libraries/ESP8266mDNS/src -I/data/cache/platformio/packages/framework-arduinoespressif8266/libraries/DNSServer/src -I/data/light-41-2/.piolibdeps/light-41-2/ESPAsyncWebServer-esphome/src -I/data/cache/platformio/packages/framework-arduinoespressif8266/libraries/ESP8266WiFi/src -I/data/cache/platformio/packages/framework-arduinoespressif8266/libraries/Hash/src -I/data/light-41-2/.piolibdeps/light-41-2/ESPAsyncTCP-esphome/src 
-I/data/cache/platformio/packages/framework-arduinoespressif8266/tools/sdk/include -I/data/cache/platformio/packages/framework-arduinoespressif8266/cores/esp8266 -I/data/cache/platformio/packages/toolchain-xtensa/include -I/data/cache/platformio/packages/framework-arduinoespressif8266/tools/sdk/lwip2/include -I/data/cache/platformio/packages/framework-arduinoespressif8266/variants/generic src/main.cpp

If I remove all the inputs (-I), the output (-o) and the source file itself (src/main.cpp), and split out each arg on a newline, I have the following. Winmerge tells me the two sets are identical:

## light-41-1
xtensa-lx106-elf-g++
  -c
  -fno-rtti
  -std=gnu++17
  -fno-exceptions
  -Wno-nonnull-compare
  -Wno-sign-compare
  -Wno-unused-but-set-variable
  -Wno-unused-variable
  -fno-exceptions
  -Os
  -mlongcalls
  -mtext-section-literals
  -falign-functions=4
  -U__STRICT_ANSI__
  -D_GNU_SOURCE
  -ffunction-sections
  -fdata-sections
  -Wall
  -Werror=return-type
  -free
  -fipa-pta
  -DPLATFORMIO=60106
  -DESP8266
  -DARDUINO_ARCH_ESP8266
  -DARDUINO_ESP8266_ESP01
  -DESPHOME_LOG_LEVEL=ESPHOME_LOG_LEVEL_INFO
  -DNEW_OOM_ABORT
  -DPIO_FRAMEWORK_ARDUINO_LWIP2_HIGHER_BANDWIDTH_LOW_FLASH
  -DUSE_ARDUINO
  -DUSE_ESP8266
  -DUSE_ESP8266_FRAMEWORK_ARDUINO
  -DUSE_STORE_LOG_STR_IN_FLASH
  -DF_CPU=80000000L
  -D__ets__
  -DICACHE_FLASH
  -DARDUINO=10805
  -DARDUINO_BOARD=\"PLATFORMIO_ESP8285\"
  -DFLASHMODE_DOUT
  -DLWIP_OPEN_SRC
  -DNONOSDK22x_190703=1
  -DTCP_MSS=1460
  -DLWIP_FEATURES=0
  -DLWIP_IPV6=0
  -DVTABLES_IN_FLASH
  -DMMU_IRAM_SIZE=0x8000
  -DMMU_ICACHE_SIZE=0x8000

## light-41-2
xtensa-lx106-elf-g++
  -c
  -fno-rtti
  -std=gnu++17
  -fno-exceptions
  -Wno-nonnull-compare
  -Wno-sign-compare
  -Wno-unused-but-set-variable
  -Wno-unused-variable
  -fno-exceptions
  -Os
  -mlongcalls
  -mtext-section-literals
  -falign-functions=4
  -U__STRICT_ANSI__
  -D_GNU_SOURCE
  -ffunction-sections
  -fdata-sections
  -Wall
  -Werror=return-type
  -free
  -fipa-pta
  -DPLATFORMIO=60106
  -DESP8266
  -DARDUINO_ARCH_ESP8266
  -DARDUINO_ESP8266_ESP01
  -DESPHOME_LOG_LEVEL=ESPHOME_LOG_LEVEL_INFO
  -DNEW_OOM_ABORT
  -DPIO_FRAMEWORK_ARDUINO_LWIP2_HIGHER_BANDWIDTH_LOW_FLASH
  -DUSE_ARDUINO
  -DUSE_ESP8266
  -DUSE_ESP8266_FRAMEWORK_ARDUINO
  -DUSE_STORE_LOG_STR_IN_FLASH
  -DF_CPU=80000000L
  -D__ets__
  -DICACHE_FLASH
  -DARDUINO=10805
  -DARDUINO_BOARD=\"PLATFORMIO_ESP8285\"
  -DFLASHMODE_DOUT
  -DLWIP_OPEN_SRC
  -DNONOSDK22x_190703=1
  -DTCP_MSS=1460
  -DLWIP_FEATURES=0
  -DLWIP_IPV6=0
  -DVTABLES_IN_FLASH
  -DMMU_IRAM_SIZE=0x8000
  -DMMU_ICACHE_SIZE=0x8000
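
The manual comparison above can also be scripted. The following is a minimal sketch, not part of esphome or PlatformIO: the helper name `normalized_args` is made up for illustration, the two command lines are shortened stand-ins for the real ones, and it assumes the source file is always the last argument, as in the logs above.

```python
import shlex


def normalized_args(command_line: str) -> list[str]:
    """Return compiler args without -I paths, the -o output, and the source file."""
    args = shlex.split(command_line)
    kept = []
    skip_next = False
    for arg in args[:-1]:  # assumption: the last token is the source file
        if skip_next:
            skip_next = False  # this token was the value of -o or -I
        elif arg in ("-o", "-I"):
            skip_next = True  # drop the flag and its separate value
        elif arg.startswith("-I"):
            pass  # drop joined include paths such as -Isrc
        else:
            kept.append(arg)
    return kept


# Shortened, illustrative versions of the two command lines above.
cmd_1 = (
    "xtensa-lx106-elf-g++ -o /data/light-41-1/.pioenvs/light-41-1/src/main.cpp.o "
    "-c -Os -DUSE_ESP8266 -Isrc "
    "-I/data/light-41-1/.piolibdeps/light-41-1/ArduinoJson/src src/main.cpp"
)
cmd_2 = cmd_1.replace("light-41-1", "light-41-2")

# True when only paths differ between the two invocations.
print(normalized_args(cmd_1) == normalized_args(cmd_2))
```

Pasting the full command lines from a verbose build into `cmd_1` and `cmd_2` should give the same kind of answer as the WinMerge comparison.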

Other files are compiled with similar (long) commandlines. Example for src/esphome/components/api/api_pb2_service.cpp:

xtensa-lx106-elf-g++ -o /data/light-41-1/.pioenvs/light-41-1/src/esphome/components/api/api_pb2_service.cpp.o -c -fno-rtti -std=gnu++17 -fno-exceptions -Wno-nonnull-compare -Wno-sign-compare -Wno-unused-but-set-variable -Wno-unused-variable -fno-exceptions -Os -mlongcalls -mtext-section-literals -falign-functions=4 -U__STRICT_ANSI__ -D_GNU_SOURCE -ffunction-sections -fdata-sections -Wall -Werror=return-type -free -fipa-pta -DPLATFORMIO=60106 -DESP8266 -DARDUINO_ARCH_ESP8266 -DARDUINO_ESP8266_ESP01 -DESPHOME_LOG_LEVEL=ESPHOME_LOG_LEVEL_INFO -DNEW_OOM_ABORT -DPIO_FRAMEWORK_ARDUINO_LWIP2_HIGHER_BANDWIDTH_LOW_FLASH -DUSE_ARDUINO -DUSE_ESP8266 -DUSE_ESP8266_FRAMEWORK_ARDUINO -DUSE_STORE_LOG_STR_IN_FLASH -DF_CPU=80000000L -D__ets__ -DICACHE_FLASH -DARDUINO=10805 -DARDUINO_BOARD=\"PLATFORMIO_ESP8285\" -DFLASHMODE_DOUT -DLWIP_OPEN_SRC -DNONOSDK22x_190703=1 -DTCP_MSS=1460 -DLWIP_FEATURES=0 -DLWIP_IPV6=0 -DVTABLES_IN_FLASH -DMMU_IRAM_SIZE=0x8000 -DMMU_ICACHE_SIZE=0x8000 -Isrc -I/data/light-41-1/.piolibdeps/light-41-1/ArduinoJson/src -I/data/light-41-1/.piolibdeps/light-41-1/noise-c/include -I/data/light-41-1/.piolibdeps/light-41-1/noise-c/src -I/data/light-41-1/.piolibdeps/light-41-1/libsodium/libsodium/src/libsodium/include -I/data/light-41-1/.piolibdeps/light-41-1/libsodium/libsodium/src/libsodium -I/data/light-41-1/.piolibdeps/light-41-1/libsodium/libsodium/src/libsodium/include/sodium -I/data/light-41-1/.piolibdeps/light-41-1/libsodium/port_include -I/data/cache/platformio/packages/framework-arduinoespressif8266/libraries/ESP8266mDNS/src -I/data/cache/platformio/packages/framework-arduinoespressif8266/libraries/DNSServer/src -I/data/light-41-1/.piolibdeps/light-41-1/ESPAsyncWebServer-esphome/src -I/data/cache/platformio/packages/framework-arduinoespressif8266/libraries/ESP8266WiFi/src -I/data/cache/platformio/packages/framework-arduinoespressif8266/libraries/Hash/src -I/data/light-41-1/.piolibdeps/light-41-1/ESPAsyncTCP-esphome/src 
-I/data/cache/platformio/packages/framework-arduinoespressif8266/tools/sdk/include -I/data/cache/platformio/packages/framework-arduinoespressif8266/cores/esp8266 -I/data/cache/platformio/packages/toolchain-xtensa/include -I/data/cache/platformio/packages/framework-arduinoespressif8266/tools/sdk/lwip2/include -I/data/cache/platformio/packages/framework-arduinoespressif8266/variants/generic src/esphome/components/api/api_pb2_service.cpp

@LordMike
Author

LordMike commented Mar 18, 2023

I've tried modifying what I think is the code that does the hashing, to see what the contents of the config are.

def compute_project_checksum(config):
    # rebuild when PIO Core version changes
    checksum = sha1(hashlib_encode_data(__version__))
    # configuration file state
    checksum.update(hashlib_encode_data(config.to_json()))
    # project file structure
    check_suffixes = (".c", ".cc", ".cpp", ".h", ".hpp", ".s", ".S")

I've put the contents below. I see multiple path-specific items, not just the env:light-41-1. Apparently the build_dir and libdeps_dir are also relevant. I'm not sure why, as surely the build flags and the contents of the source code (headers and code) are enough to determine whether the output will be identical. The actual location of the files should be irrelevant.

Something weird is going on though, because esphome has chosen to "hack" out this bit of platformio: https://github.com/esphome/esphome/blob/b184b01600c9cb20f8d130dbac270809ac706c24/esphome/platformio_api.py#L18-L41. My code printing this config didn't actually run until I commented out that bit in esphome, which means that this function isn't even running in my situation. It feels like this function is a red herring.

[
  [
    "common",
    [
      [
        "lib_deps",
        ""
      ],
      [
        "build_flags",
        ""
      ],
      [
        "upload_flags",
        ""
      ]
    ]
  ],
  [
    "platformio",
    [
      [
        "description",
        "ESPHome 2023.3.0"
      ],
      [
        "globallib_dir",
        "/piolibs"
      ],
      [
        "platforms_dir",
        "/data/cache/platformio/platforms"
      ],
      [
        "packages_dir",
        "/data/cache/platformio/packages"
      ],
      [
        "cache_dir",
        "/data/cache/platformio/cache"
      ],
      [
        "build_dir",
        "/data/light-41-1/.pioenvs"
      ],
      [
        "libdeps_dir",
        "/data/light-41-1/.piolibdeps"
      ]
    ]
  ],
  [
    "env:light-41-1",
    [
      [
        "board",
        "esp8285"
      ],
      [
        "board_build.flash_mode",
        "dout"
      ],
      [
        "board_build.ldscript",
        "eagle.flash.1m.ld"
      ],
      [
        "build_flags",
        [
          "-DESPHOME_LOG_LEVEL=ESPHOME_LOG_LEVEL_INFO",
          "-DNEW_OOM_ABORT",
          "-DPIO_FRAMEWORK_ARDUINO_LWIP2_HIGHER_BANDWIDTH_LOW_FLASH",
          "-DUSE_ARDUINO",
          "-DUSE_ESP8266",
          "-DUSE_ESP8266_FRAMEWORK_ARDUINO",
          "-DUSE_STORE_LOG_STR_IN_FLASH",
          "-Wno-nonnull-compare",
          "-Wno-sign-compare",
          "-Wno-unused-but-set-variable",
          "-Wno-unused-variable",
          "-fno-exceptions"
        ]
      ],
      [
        "extra_scripts",
        [
          "post:post_build.py"
        ]
      ],
      [
        "framework",
        [
          "arduino"
        ]
      ],
      [
        "lib_deps",
        [
          "ottowinter/ESPAsyncTCP-esphome@1.2.3",
          "esphome/ESPAsyncWebServer-esphome@2.1.0",
          "DNSServer",
          "ESP8266WiFi",
          "ESP8266mDNS",
          "esphome/noise-c@0.1.4",
          "bblanchon/ArduinoJson@6.18.5"
        ]
      ],
      [
        "lib_ldf_mode",
        "off"
      ],
      [
        "platform",
        "platformio/espressif8266 @ 3.2.0"
      ],
      [
        "platform_packages",
        [
          "platformio/framework-arduinoespressif8266 @ ~3.30002.0"
        ]
      ]
    ]
  ]
]
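
To illustrate why the path-specific items in a dump like this matter to any checksum computed over the whole configuration, here is a small standalone sketch. It is not PlatformIO's code: `config_checksum`, `make_config` and `strip_device_specifics` are hypothetical helpers, and the config is a heavily reduced version of the dump above. Whether such a checksum is what actually drives the build cache is a separate question.

```python
import hashlib
import json


def config_checksum(config: dict) -> str:
    """SHA-1 over a JSON dump of the config (a simplified stand-in)."""
    payload = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def make_config(device: str) -> dict:
    """Tiny config shaped like the dump above, with device-specific paths."""
    return {
        "platformio": {
            "packages_dir": "/data/cache/platformio/packages",
            "build_dir": f"/data/{device}/.pioenvs",
            "libdeps_dir": f"/data/{device}/.piolibdeps",
        },
        f"env:{device}": {"board": "esp8285", "framework": ["arduino"]},
    }


def strip_device_specifics(config: dict) -> dict:
    """Hypothetical normalisation: drop per-device dirs and the env name."""
    per_device = {"build_dir", "libdeps_dir"}
    result = {}
    for section, options in config.items():
        name = "env" if section.startswith("env:") else section
        result[name] = {k: v for k, v in options.items() if k not in per_device}
    return result


a = make_config("light-41-1")
b = make_config("light-41-2")

# Raw configs hash differently because of the env name and the two directories.
print(config_checksum(a) == config_checksum(b))
# After normalisation the two devices hash identically.
print(config_checksum(strip_device_specifics(a)) == config_checksum(strip_device_specifics(b)))
```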

@LordMike
Author

So the hashing code above is obviously a red herring, since it doesn't actually run in my case, but if I set build_cache_dir, and recompile a single device twice (clearing output in between), it uses the cache for the files it already compiled.

Retrieved `/data/light-41-1/.pioenvs/light-41-1/src/esphome/components/api/api_connection.cpp.o' from cache
Retrieved `/data/light-41-1/.pioenvs/light-41-1/src/esphome/components/api/api_frame_helper.cpp.o' from cache
Retrieved `/data/light-41-1/.pioenvs/light-41-1/src/esphome/components/api/api_pb2.cpp.o' from cache
Retrieved `/data/light-41-1/.pioenvs/light-41-1/src/esphome/components/api/api_pb2_service.cpp.o' from cache
...

I followed the log message above to SCons:

And here I fear that sigs.append(self.get_internal_path()) might actually append the real file path on disk. This would mean that you could never reuse a build output between two identical source files. The first point hints that .get_internal_path() is actually the full path, as seen in the log output I have. ... :'(

@ivankravets
Member

Sorry, we can't help you much with 3rd party software. If you still experience any issues with PlatformIO Core, please create a clean, simple, and independent PlatformIO-based project to reproduce this issue.

@mcspr
Contributor

mcspr commented Mar 20, 2023

I believe the gist of the issue is still the same as in my previous ones: how do we build projects with multiple environments fast? The ESPHome, ESPurna, etc. build processes are already about as minimal an example as we can get.

Libraries are built separately, framework files are built separately; when the build flags stay exactly the same, rebuilding everything is not a very effective way to spend CPU resources.

@ivankravets
Member

@mcspr , could you reproduce this issue with the bare PlatformIO projects?

@mcspr
Contributor

mcspr commented Mar 20, 2023

@ivankravets https://github.com/mcspr/pio4574

So, there are two identical envs - 'a' and 'b'. Doing some clean-up and building 'a' first

> rm -rf .build_cache/
> pio run -s -t clean
> pio run -e a | grep 'from cache' | grep libFrameworkArduino.a
> pio run -s -t clean
> pio run -e a | grep 'from cache' | grep libFrameworkArduino.a
Retrieved `.pio/build/a/libFrameworkArduino.a' from cache

Cache picked up our framework file, everything good. Building 'b' next

> pio run -e b | grep 'from cache' | grep libFrameworkArduino.a

Can't reuse already built file from 'a' per the description above, even when they are identical. 2nd run of 'b' works with the cache related to 'b' only

> pio run -s -e b -t clean
> pio run -e b | grep 'from cache' | grep libFrameworkArduino.a
Retrieved `.pio/build/b/libFrameworkArduino.a' from cache
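
Instead of piping through grep, the same check can be done on captured build output. This is a small sketch under the assumption that cache hits are always logged in the "Retrieved `...' from cache" form shown above; the helper name `cached_targets` and the sample text are made up for illustration.

```python
import re

# Matches lines such as: Retrieved `.pio/build/a/libFrameworkArduino.a' from cache
RETRIEVED = re.compile(r"^Retrieved `(.+)' from cache$")


def cached_targets(build_output: str) -> list[str]:
    """Return the target paths that the build log reports as cache hits."""
    targets = []
    for line in build_output.splitlines():
        match = RETRIEVED.match(line.strip())
        if match:
            targets.append(match.group(1))
    return targets


# Illustrative sample; only the "Retrieved" line format is taken from the logs above.
sample = """\
some other build output
Retrieved `.pio/build/a/libFrameworkArduino.a' from cache
"""

hits = cached_targets(sample)
print(hits)
# For env 'b' to count as sharing the cache, its first build would need hits too.
print(any(path.endswith("libFrameworkArduino.a") for path in hits))
```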

@ivankravets
Member

Thanks for the report. This is a bug.

Please re-test with pio upgrade --dev.

@mcspr
Contributor

mcspr commented Mar 20, 2023

Still does not retrieve cached .a from one env to the other

Because of env separation, CacheDir retrieval uses build-dir target paths

...
CacheRetrieve(.pio/build/a/libFrameworkArduino.a): retrieving from /home/runner/dev/pio4574/.build_cache/1A/1a8f427486d8fceb4d313ff8ca8f9f2e
...

Which in turn depends on the env name, which we wanted to avoid
(e.g. have some kind of stable path .pio/build/.framework-build-flags-hash-foo/file.ext)

@ivankravets
Copy link
Member

Could it be the limitation of SCons? I think SCons does not cache library archives.

cc: @bdbaddog, @mwichmann

@LordMike
Author

I found earlier that SCons internally likely uses the full path name to the file it is working on. This means that any compilation will be unique for different environments by default :/

Maybe stuff can be done with relative paths as suggested.

@bdbaddog

SCons (unlike many other build tools) can cache all targets it can build. Not just ones created by compilers/linkers.

SCons uses the full command line used to generate any target (along with content hashes of the sources, including implicit sources such as, potentially, the compiler binary, included header files, libraries, etc.) to generate the hash used to name the file in the cache.

So yes, a full path in the target would make the cache less useful.

You can exclude parts of the command line from the hash calculation, but I don't think you could exclude part of the path of the target for an absolute path.

@mcspr
Contributor

mcspr commented Mar 22, 2023

> I found earlier that SCons internally likely uses the full path name to the file it is working on. This means that any compilation will be unique for different environments by default :/
>
> Maybe stuff can be done with relative paths as suggested.

Hacking around the signature generator and removing BUILD_DIR prefix seems to work for the (updated) example above.
mcspr/scons@a2d6a1b99

based on the #4574 (comment) tracing of where it comes from. I guess whether it is relative or absolute depends on how PIO invokes scons; the way I build, it always comes up relative to the project root.

@bdbaddog

BUILD_DIR is not a SCons-provided envvar, so that patch wouldn't work in SCons. However, there is a get_relpath() method on Nodes which might work...

@bdbaddog

Where is the offending directory located relative to the SConstruct? In a sub dir, a sibling dir, or up a few and down a few dirs?

@ivankravets
Member

@mcspr, do you see any issues in the PlatformIO Core? Maybe, it is our problem? From what I remember, we always use relative paths to the project directory.

@mcspr
Contributor

mcspr commented Mar 22, 2023

> do you see any issues in the PlatformIO Core? Maybe, it is our problem? From what I remember, we always use relative paths to the project directory.

The gist of the issue (as I understood it) - cache for certain build output depends on path. Since we have multiple environments, BUILD_DIR is different, we can't share cache entries between them. Relative, absolute, etc. does not really matter, just the fact that BUILD_DIR is there.

What I tried above is to simply remove the BUILD_DIR from the signature calculation for the node. When building the same thing, relative paths inside the build directory allow the cache to be shared.

@mwichmann

If this is the SCons CacheDir cache (sorry to not know what PlatformIO is using), as @bdbaddog has explained, the name of the file written to the cache directory is computed from several pieces of data, only one of which is the path to the target file; that is sure to result in a unique hash (the hash is used as the filename). If it didn't, then SCons would already have complained about multiple definitions of the same target and aborted. When you later redo the calculation, you either get a "hit" - that object is in the cache, or you don't (it does mean the cache can grow sneakily). If this isn't the SCons cache, please ignore this comment entirely.

@ivankravets reopened this Mar 22, 2023
@ivankravets
Member

@bdbaddog, @mwichmann, thanks again for being willing to help us!

I've just created a small project (scons-caching.zip) to reproduce this issue. If you try to build the first time, you will see that for both variants "hello.c" is built.

What PlatformIO users expect is that hello.c -> hello.o is built just once and later reused from the cache. Currently, the "cache" mechanism is linked to the variant directory. PlatformIO uses a unique variant directory per "working environment" (what the developers above call environments). They have 2 working environments (in SCons terms, the same source files but different variant dirs), and they want to share the cache between them.

Do you have any ideas on how to solve it?

@ivankravets
Member

I forgot about the log for the project which I provided:

  1. Build the project. The issue here is that src/hello.c was built twice instead of just for the first time for build1.
$ scons.py
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
gcc -o build1/hello.o -c src/hello.c
gcc -o build1/hello build1/hello.o
gcc -o build2/hello.o -c src/hello.c
gcc -o build2/hello build2/hello.o
scons: done building targets.
  2. Try to build the project again. Everything is good this time.
$ scons.py
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
scons: `.' is up to date.
scons: done building targets.
  3. Remove the "build*" folders. The cache is retrieved, which is good.
$ scons.py
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
Retrieved `build1/hello.o' from cache
Retrieved `build1/hello' from cache
Retrieved `build2/hello.o' from cache
Retrieved `build2/hello' from cache
scons: done building targets.

What do we expect when the project is fully cleaned? Something like this:

$ scons.py
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
gcc -o build1/hello.o -c src/hello.c
gcc -o build1/hello build1/hello.o
Retrieved `build2/hello.o' from a cache ----> here
gcc -o build2/hello build2/hello.o
scons: done building targets.

I understand that "build2" has not been built yet. But build1/hello.o is the same as the upcoming build2/hello.o. The same source, the same environment.

@bdbaddog

Haven't had a chance to look at the repro, but just a quick thought: can your build use duplicate=False in your variant dirs?

@ivankravets
Member

@bdbaddog, yes, see the source code of SConstruct:

CacheDir('build_cache')
VariantDir('build1', 'src', duplicate=False)
VariantDir('build2', 'src', duplicate=False)
env1 = Environment()
env1.Program('build1/hello.c')
env2 = Environment()
env2.Program('build2/hello.c')

@mcspr
Contributor

mcspr commented Mar 27, 2023

The best I could come up with is to have a custom env var CACHEDIR_NODE_DIR and, when it exists, pass it along to the signature func as the dir instead of the hard-coded internal path. I updated the commit above, and it even has a small test case.

SCons/scons@master...mcspr:pio4574
develop...mcspr:pio4574

Pretty annoying to manage combined with VariantDir, but seems to work.
(but, I don't speak SCons fluently, so idk if this is even worth it vs. just using the node name as-is instead of path; that would definitely be shorter)

@ivankravets modified the milestones: 6.1.7, Backlog Mar 28, 2023
@bdbaddog

> @bdbaddog, yes, see the source code of SConstruct:
>
> CacheDir('build_cache')
> VariantDir('build1', 'src', duplicate=False)
> VariantDir('build2', 'src', duplicate=False)
> env1 = Environment()
> env1.Program('build1/hello.c')
> env2 = Environment()
> env2.Program('build2/hello.c')

So this works fine unless the target with path is included in the generated file by the builder.
Please file an enhancement request in SCons?
As I understand it you'd like for two identical build commands excluding different target locations to be able to use the same cached built file. Is that correct?

@ivankravets
Member

> As I understand it you'd like for two identical build commands excluding different target locations to be able to use the same cached built file. Is that correct?

Yes, correct. Let me give you a few real examples. C/C++ developers use third-party libraries/frameworks in their projects. These libraries/frameworks are common across projects. Following the example above, each project will have its own VariantDir but a COMMON CacheDir.

Now, when you build a few projects on the same host machine (or CI), developers expect that common libraries/frameworks will not be recompiled for each project. This is actually the context of this issue #4574.

If we use a common CacheDir, we expect that SCons will pull pre-built artifacts from the cache storage instead of relying on VariantDir, which should have LESS priority.

@ivankravets
Member

> Please file an enhancement request in SCons?

@mcspr, could I ask you to file a feature request at https://github.com/SCons/scons/issues and point to this issue? You already reviewed the SCons sources and have a better understanding of this problem.

@mwichmann

In this example, you can place the built tree (say of boost - I know that's a common one) in what SCons calls a "repository", then various builds in SCons will populate from that as long as they specify that they use a repository. It's kind of like a cache at the opposite end, if that makes sense.

@bdbaddog

This is unlikely to be a general change, but rather one specified per builder, or enabled per user, because, as I've said above, some tools embed the target file/location/other related info in the generated file.
But in many cases (including yours) it makes sense to enable it at least for some builders.

@ivankravets
Member

@mwichmann, thanks for the hint with env.Repository(). I've just tried it and it didn't work for us. As I understand it, the goal of "Repository" is to keep artifacts/objects directly in the "Repository" folder. Only in this case might the caching work.

The issue we faced is using VariantDir + CacheDir together. Now, it is not clear to me what CacheDir contains. Does it contain only references to the objects which are stored in the VariantDir or Repository, or does it contain the complete artifacts/objects of the compiled source files?

@mwichmann

The actual objects are copied to the cache as they're built, and copied back when needed.

@bdbaddog

Had a thought. Not sure if it would resolve this, but can one of you give it a try?
So the issue (I think) is that the hash of the command line is different, and that contributes to the build hash used to name the file in the cache.

So if the -o $TARGET was changed to $( -o $TARGET $) where $( and $) exclude items on the command line from that action's signature, it's possible many builds from the same source with the same flags but different output file names/locations could map to the same cachedir file.

Does that make sense?
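
To make the suggestion concrete, here is a minimal sketch of the idea in plain Python. It is a simplified model, not SCons's actual implementation: `signature_text` and `action_signature` are hypothetical helpers that only mimic how text between `$(` and `$)` is left out of an action's signature, and the command lines are made-up examples. It covers the action signature only, not any other input to the cache file name.

```python
import hashlib


def signature_text(command_template: str) -> str:
    """Return the command with everything between '$(' and '$)' dropped.

    Toy model: tokens are split on whitespace and the markers must be
    standalone tokens.
    """
    kept, skipping = [], False
    for token in command_template.split():
        if token == "$(":
            skipping = True
        elif token == "$)":
            skipping = False
        elif not skipping:
            kept.append(token)
    return " ".join(kept)


def action_signature(command_template: str) -> str:
    """Hash only the part of the command that counts towards the signature."""
    return hashlib.sha256(signature_text(command_template).encode()).hexdigest()


# Same source and flags, different output locations (hypothetical paths).
plain_a = "gcc -Os -c main.c -o build/env_a/main.o"
plain_b = "gcc -Os -c main.c -o build/env_b/main.o"
wrapped_a = "gcc -Os -c main.c $( -o build/env_a/main.o $)"
wrapped_b = "gcc -Os -c main.c $( -o build/env_b/main.o $)"

plain_match = action_signature(plain_a) == action_signature(plain_b)
wrapped_match = action_signature(wrapped_a) == action_signature(wrapped_b)
```

In this model the two plain commands hash differently, while the two wrapped commands hash the same because the output path no longer contributes.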

@mcspr
Contributor

mcspr commented Mar 31, 2023

Right. But looking at the code path the cachedir sig takes, the children's and the Node's own contents sigs match, right up until it appends the path of the node, which differs.
https://github.com/SCons/scons/blob/c80cbb0846c5d9265f356e86c4e79f1d13e3e8a8/SCons/Node/FS.py#L3693-L3697
https://github.com/SCons/scons/blob/c80cbb0846c5d9265f356e86c4e79f1d13e3e8a8/SCons/Node/FS.py#L3713

Using only the Node's 'children + own' signature instead of 'children + own + path' would definitely fix the issue; the 'own' signature is enough to distinguish targets
(whereas the patch above was trying to preserve existing behaviour and optionally normalize the node path based on some known variable, pretending that we were Chdir()ed into the build directory)
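
A rough illustration of the difference, as a toy model rather than the code in `SCons/Node/FS.py`: `cache_key` is a hypothetical function that hashes the children signatures and the node's own signature, and optionally the node path. All signature strings and paths below are invented.

```python
import hashlib


def cache_key(child_sigs, own_sig, path=None):
    """Combine signatures into one key; append the path only if given."""
    parts = list(child_sigs) + [own_sig]
    if path is not None:
        parts.append(path)
    return hashlib.sha256("\n".join(parts).encode()).hexdigest()


# Identical inputs for two environments; only the target path differs.
children = ["sig-of-main.c", "sig-of-config.h"]
own = "sig-of-compile-action"

with_path_a = cache_key(children, own, ".pio/build/device_a/src/main.o")
with_path_b = cache_key(children, own, ".pio/build/device_b/src/main.o")
without_path_a = cache_key(children, own)
without_path_b = cache_key(children, own)

keys_differ_with_path = with_path_a != with_path_b
keys_match_without_path = without_path_a == without_path_b
```

With the path included, the two environments never share a cache entry even though every other input is identical; without it, they map to the same key.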

@bdbaddog

You are correct, ignoring the target from the build signature won't resolve it.

The problem is that ignoring the target file and its path is not a good change for all builders.

For your usage (and many others) this may be acceptable for compile and link builders.
So likely this should be a per builder (and/or per target node) option, defaulting to including the target path.

Please file an enhancement request issue in SCons' github repo.

@mcspr
Contributor

mcspr commented Mar 31, 2023

So likely this should be a per builder (and/or per target node) option, defaulting to including the target path.

If I understood the idea correctly...
SCons/scons@master...mcspr:scons:pio4574v2

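# Factory function or scanner hook somewhere on the PIO side (exact place undecided).
# set_cachedir_bsig_path() comes from the branch linked above, not from released SCons.
# `env` is the construction environment in scope at that point.
def make_cache_friendly_file(node, *args, **kwargs):
    x = env.File(node)
    x.set_cachedir_bsig_path('')  # drop the node path from the cachedir signature
    return x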

Please file an enhancement request issue in SCons' github repo.

Will do

@expaso

expaso commented Apr 16, 2023

I have been wondering about this behaviour for a few years, and by coincidence I decided to dig into it myself today.
And voila! This discussion couldn't be more timely :)

On topic: ESPHome auto-generates the source directories by itself. What would happen if we made it generate symlinks for all the library code and the components? Wouldn't these source files then be seen as identical paths on disk, thus hitting the cache?

@LordMike
Author

Sounds like the approach https://pnpm.io/ is using. NPM has the same issue, just way worse: if APP depends on A and B, and A also depends on B, npm creates a folder structure that includes B twice. This does not scale at all. Pnpm fixes this with a hierarchy of symlinks.

To make that work here, the link's path needs to be resolved. I didn't spot any "readlink"-like approach when I was looking through the code, but I might've missed it.

@mwichmann

It's not an excuse, just an explanation: because of the Windows symlink situation (previously needed elevated privilege, now needing "developer mode" to be set, neither of which SCons itself can control), SCons has stayed away from depending heavily on symlinks. There are cases, like variant dirs, where symlinking is done and there's a policy difference between platforms on whether to attempt, but that's relatively rare.

@expaso

expaso commented Apr 22, 2023

I think it's generally a good idea to stay away from relying on symlinks, that's for sure.

But in this case, if the source file happens to be a symlink (because that's simply the way the source folders were organized), maybe SCons could follow the symlink and store the real location of the file as part of the cache key instead.

What other options do we have left?
Two separate files, in separate physical locations, split across two environments. This can only be resolved by hashing the file contents and using the hash as the key.
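
As a minimal sketch of that last option, assuming nothing about how SCons or PlatformIO would actually implement it: `content_key` and `write_source` are hypothetical helpers, and the two temporary directories stand in for two environments holding byte-identical copies of the same generated source file.

```python
import hashlib
import os
import tempfile


def content_key(path: str) -> str:
    """Key a source file by its bytes, ignoring where it lives on disk."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def write_source(directory: str, text: str) -> str:
    """Write a small stand-in source file and return its path."""
    path = os.path.join(directory, "component.cpp")
    with open(path, "w") as handle:
        handle.write(text)
    return path


with tempfile.TemporaryDirectory() as env_a, tempfile.TemporaryDirectory() as env_b:
    src_a = write_source(env_a, "int answer() { return 42; }\n")
    src_b = write_source(env_b, "int answer() { return 42; }\n")

    # The resolved paths differ, so a path-based key can never match...
    same_path = os.path.realpath(src_a) == os.path.realpath(src_b)
    # ...but a content-based key is identical for both copies.
    same_content = content_key(src_a) == content_key(src_b)
```

Resolving symlinks with `os.path.realpath` would only help when both environments point at one physical file; the content hash also covers the case of two physical copies.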
