Description
Building and linking using zig only:
./zig-x86_64-relsafe-espressif-linux-musl-baseline/zig build-exe start.zig -target xtensa-freestanding-none -mcpu=<cpuname>
CPUs that hit the error below:
- esp32
- esp32s3
- cnl (Intel Cannon Lake variant for Xtensa)
Workaround: add -fno-compiler-rt to skip the error.
LLVM Emit Object... LLVM ERROR: Cannot select: 0x7379c54b5200: f32 = fp16_to_fp 0x7379c5783cc0, float_from_int.zig:46:9 @[ floatsihf.zig:11:24 ]
0x7379c5783cc0: i32 = or 0x7379c578d240, 0x7379c578d390, float_from_int.zig:46:9 @[ floatsihf.zig:11:24 ]
0x7379c578d240: i32 = AssertZext 0x7379c54a17c0, ValueType:ch:i16, float_from_int.zig:46:9 @[ floatsihf.zig:11:24 ]
0x7379c54a17c0: i32,ch = CopyFromReg 0x7379c5d20110, Register:i32 %1, float_from_int.zig:46:9 @[ floatsihf.zig:11:24 ]
0x7379c54b60c0: i32 = Register %1
0x7379c578d390: i32 = XtensaISD::PCREL_WRAPPER TargetConstantPool:i32<i32 31744> 0
0x7379c5786680: i32 = TargetConstantPool<i32 31744> 0
In function: __floatsihf
[1] 3295 IOT instruction (core dumped) ./zig-x86_64-relsafe-espressif-linux-musl-baseline/zig build-exe start.zig
zig-espressif-bootstrap/zig/lib/compiler_rt/float_from_int.zig, lines 44 to 46 in 8d67986
zig-espressif-bootstrap/zig/lib/compiler_rt/floatdihf.zig, lines 10 to 12 in 8d67986
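For readers decoding the dump above: the constant 31744 in the TargetConstantPool node equals 0x7C00, which is the IEEE 754 binary16 bit pattern for +infinity, and fp16_to_fp is the node that widens a 16-bit half-precision bit pattern to f32. The Python sketch below only illustrates what that conversion computes. It is not taken from the Zig or LLVM sources, and the helper names `half_bits_to_float` and `reference` are hypothetical. That the constant comes from an overflow-to-infinity branch in float_from_int.zig is an assumption based on the value alone.

```python
import math
import struct

# Constant seen in the dump: TargetConstantPool:i32<i32 31744>
F16_INF_BITS = 31744


def half_bits_to_float(bits: int) -> float:
    """Decode a 16-bit IEEE 754 binary16 pattern (illustrative fp16_to_fp)."""
    sign = -1.0 if (bits >> 15) & 1 else 1.0
    exponent = (bits >> 10) & 0x1F  # 5 exponent bits, bias 15
    fraction = bits & 0x3FF         # 10 fraction bits
    if exponent == 0x1F:
        # All-ones exponent: infinity when the fraction is zero, else NaN.
        return sign * math.inf if fraction == 0 else math.nan
    if exponent == 0:
        # Zero or subnormal: no implicit leading one.
        return sign * fraction * 2.0 ** -24
    return sign * (1.0 + fraction / 1024.0) * 2.0 ** (exponent - 15)


def reference(bits: int) -> float:
    """Same conversion using the standard library's half-float format 'e'."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


if __name__ == "__main__":
    print(hex(F16_INF_BITS), half_bits_to_float(F16_INF_BITS))
```

A target without a native instruction for this conversion normally needs the node expanded or lowered to a library call, which is consistent with the "Cannot select" failure reported for __floatsihf.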
- esp32s2 works!!
Time report:
===-------------------------------------------------------------------------===
Instruction Selection and Scheduling
===-------------------------------------------------------------------------===
Total Execution Time: 0.0000 seconds (0.0000 wall clock)
---User Time--- --User+System-- ---Wall Time--- --- Name ---
0.0000 ( 34.8%) 0.0000 ( 34.8%) 0.0000 ( 34.4%) Instruction Selection
0.0000 ( 15.2%) 0.0000 ( 15.2%) 0.0000 ( 15.4%) Instruction Creation
0.0000 ( 13.0%) 0.0000 ( 13.0%) 0.0000 ( 13.3%) Instruction Scheduling
0.0000 ( 10.9%) 0.0000 ( 10.9%) 0.0000 ( 11.3%) Vector Legalization
0.0000 ( 10.9%) 0.0000 ( 10.9%) 0.0000 ( 10.8%) DAG Combining 1
0.0000 ( 4.3%) 0.0000 ( 4.3%) 0.0000 ( 4.6%) DAG Legalization
0.0000 ( 4.3%) 0.0000 ( 4.3%) 0.0000 ( 4.1%) Type Legalization
0.0000 ( 4.3%) 0.0000 ( 4.3%) 0.0000 ( 4.1%) DAG Combining 2
0.0000 ( 2.2%) 0.0000 ( 2.2%) 0.0000 ( 2.1%) Instruction Scheduling Cleanup
0.0000 (100.0%) 0.0000 (100.0%) 0.0000 (100.0%) Total
===-------------------------------------------------------------------------===
Pass execution timing report
===-------------------------------------------------------------------------===
Total Execution Time: 0.0004 seconds (0.0004 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.0002 ( 48.3%) 0.0000 ( 0.0%) 0.0002 ( 45.2%) 0.0002 ( 44.6%) Xtensa DAG->DAG Pattern Instruction Selection
0.0001 ( 15.9%) 0.0000 ( 0.0%) 0.0001 ( 14.9%) 0.0001 ( 14.3%) Xtensa Assembly Printer
0.0000 ( 7.8%) 0.0000 ( 0.0%) 0.0000 ( 7.3%) 0.0000 ( 7.0%) Live DEBUG_VALUE analysis
0.0000 ( 3.6%) 0.0000 ( 0.0%) 0.0000 ( 3.4%) 0.0000 ( 3.3%) Prologue/Epilogue Insertion & Frame Finalization
0.0000 ( 3.0%) 0.0000 ( 0.0%) 0.0000 ( 2.8%) 0.0000 ( 2.8%) MachineDominator Tree Construction
0.0000 ( 0.0%) 0.0000 ( 13.0%) 0.0000 ( 0.8%) 0.0000 ( 1.4%) Lower constant intrinsics
0.0000 ( 1.2%) 0.0000 ( 0.0%) 0.0000 ( 1.1%) 0.0000 ( 1.1%) Fast Register Allocator
0.0000 ( 0.9%) 0.0000 ( 0.0%) 0.0000 ( 0.8%) 0.0000 ( 0.8%) Free MachineFunction
0.0000 ( 0.9%) 0.0000 ( 0.0%) 0.0000 ( 0.8%) 0.0000 ( 0.8%) Machine Natural Loop Construction
0.0000 ( 0.0%) 0.0000 ( 8.7%) 0.0000 ( 0.6%) 0.0000 ( 0.8%) Remove unreachable blocks from the CFG #2
0.0000 ( 0.9%) 0.0000 ( 0.0%) 0.0000 ( 0.8%) 0.0000 ( 0.8%) Stack Frame Layout Analysis
0.0000 ( 0.9%) 0.0000 ( 0.0%) 0.0000 ( 0.8%) 0.0000 ( 0.8%) Finalize ISel and expand pseudo-instructions
0.0000 ( 0.9%) 0.0000 ( 0.0%) 0.0000 ( 0.8%) 0.0000 ( 0.8%) Xtensa instruction size reduction pass
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.6%) Two-Address instruction pass
0.0000 ( 0.3%) 0.0000 ( 4.3%) 0.0000 ( 0.6%) 0.0000 ( 0.6%) Safe Stack instrumentation pass
0.0000 ( 0.0%) 0.0000 ( 4.3%) 0.0000 ( 0.3%) 0.0000 ( 0.6%) Insert stack protectors
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.6%) Machine Natural Loop Construction #2
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.6%) Branch relaxation pass
0.0000 ( 0.0%) 0.0000 ( 4.3%) 0.0000 ( 0.3%) 0.0000 ( 0.6%) Scalarize Masked Memory Intrinsics
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.6%) Fixup Statepoint Caller Saved
0.0000 ( 0.0%) 0.0000 ( 4.3%) 0.0000 ( 0.3%) 0.0000 ( 0.6%) Expand vector predication intrinsics
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.6%) StackMap Liveness Analysis
0.0000 ( 0.3%) 0.0000 ( 8.7%) 0.0000 ( 0.8%) 0.0000 ( 0.6%) Shadow Stack GC Lowering
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.6%) MachineDominator Tree Construction #2
0.0000 ( 0.0%) 0.0000 ( 8.7%) 0.0000 ( 0.6%) 0.0000 ( 0.6%) Pre-ISel Intrinsic Lowering
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.5%) Eliminate PHI nodes for register allocation
0.0000 ( 0.0%) 0.0000 ( 4.3%) 0.0000 ( 0.3%) 0.0000 ( 0.5%) Expand large div/rem
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.5%) Machine Optimization Remark Emitter
0.0000 ( 0.3%) 0.0000 ( 4.3%) 0.0000 ( 0.6%) 0.0000 ( 0.5%) Expand large fp convert
0.0000 ( 0.3%) 0.0000 ( 4.3%) 0.0000 ( 0.6%) 0.0000 ( 0.5%) Expand Atomic instructions
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.5%) Analyze Machine Code For Garbage Collection
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.5%) Insert fentry calls
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.5%) Insert XRay ops
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.5%) Implement the 'patchable-function' attribute
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.5%) Xtensa bool reg fixup pass
0.0000 ( 0.3%) 0.0000 ( 4.3%) 0.0000 ( 0.6%) 0.0000 ( 0.5%) Assignment Tracking Analysis
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.5%) Lower invoke and unwind, for unwindless code generators
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.5%) Machine Optimization Remark Emitter #2
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.5%) Lazy Machine Block Frequency Analysis #2
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.5%) Contiguously Lay Out Funclets
0.0000 ( 0.0%) 0.0000 ( 13.0%) 0.0000 ( 0.8%) 0.0000 ( 0.5%) Remove unreachable blocks from the CFG
0.0000 ( 0.0%) 0.0000 ( 4.3%) 0.0000 ( 0.3%) 0.0000 ( 0.5%) Lower Garbage Collection Instructions
0.0000 ( 0.0%) 0.0000 ( 4.3%) 0.0000 ( 0.3%) 0.0000 ( 0.5%) Prepare callbr
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.5%) Xtensa Hardware Loop Fixup
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.3%) Assumption Cache Tracker
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.3%) Local Stack Slot Allocation
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.3%) Xtensa fix PSRAM cache issue in the ESP32 chips
0.0000 ( 0.0%) 0.0000 ( 4.3%) 0.0000 ( 0.3%) 0.0000 ( 0.3%) Expand reduction intrinsics
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.3%) Xtensa Hardware Loops
0.0000 ( 0.6%) 0.0000 ( 0.0%) 0.0000 ( 0.6%) 0.0000 ( 0.3%) Post-RA pseudo instruction expansion pass
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.3%) Lazy Machine Block Frequency Analysis
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.3%) Create Garbage Collector Module Metadata
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.3%) Target Library Information
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.3%) Target Transform Information
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.3%) Remove Redundant DEBUG_VALUE analysis
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.3%) Profile summary info
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.3%) Machine Sanitizer Binary Metadata
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.2%) Xtensa Constant Islands
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.2%) Machine Branch Probability Analysis
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Target Pass Configuration
0.0000 ( 0.3%) 0.0000 ( 0.0%) 0.0000 ( 0.3%) 0.0000 ( 0.0%) Machine Module Information
0.0003 (100.0%) 0.0000 (100.0%) 0.0004 (100.0%) 0.0004 (100.0%) Total
===-------------------------------------------------------------------------===
DWARF Emission
===-------------------------------------------------------------------------===
Total Execution Time: 0.0002 seconds (0.0002 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.0001 (100.0%) 0.0001 (100.0%) 0.0002 (100.0%) 0.0002 (100.0%) Debug Info Emission
0.0001 (100.0%) 0.0001 (100.0%) 0.0002 (100.0%) 0.0002 (100.0%) Total
===-------------------------------------------------------------------------===
Analysis execution timing report
===-------------------------------------------------------------------------===
Total Execution Time: 0.0000 seconds (0.0000 wall clock)
--System Time-- --User+System-- ---Wall Time--- --- Name ---
0.0000 ( 60.0%) 0.0000 ( 60.0%) 0.0000 ( 61.9%) TargetLibraryAnalysis
0.0000 ( 20.0%) 0.0000 ( 20.0%) 0.0000 ( 19.0%) InnerAnalysisManagerProxy<FunctionAnalysisManager, Module>
0.0000 ( 20.0%) 0.0000 ( 20.0%) 0.0000 ( 19.0%) ProfileSummaryAnalysis
0.0000 (100.0%) 0.0000 (100.0%) 0.0000 (100.0%) Total
===-------------------------------------------------------------------------===
Pass execution timing report
===-------------------------------------------------------------------------===
Total Execution Time: 0.0000 seconds (0.0000 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.0000 ( 20.0%) 0.0000 ( 34.8%) 0.0000 ( 32.1%) 0.0000 ( 38.5%) AlwaysInlinerPass
0.0000 ( 40.0%) 0.0000 ( 34.8%) 0.0000 ( 35.7%) 0.0000 ( 34.4%) AnnotationRemarksPass
0.0000 ( 40.0%) 0.0000 ( 30.4%) 0.0000 ( 32.1%) 0.0000 ( 27.0%) CoroConditionalWrapper
0.0000 (100.0%) 0.0000 (100.0%) 0.0000 (100.0%) 0.0000 (100.0%) Total
===-------------------------------------------------------------------------===
Pass execution timing report
===-------------------------------------------------------------------------===
Total Execution Time: 0.0000 seconds (0.0000 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.0000 ( 20.0%) 0.0000 ( 34.8%) 0.0000 ( 32.1%) 0.0000 ( 38.5%) AlwaysInlinerPass
0.0000 ( 40.0%) 0.0000 ( 34.8%) 0.0000 ( 35.7%) 0.0000 ( 34.4%) AnnotationRemarksPass
0.0000 ( 40.0%) 0.0000 ( 30.4%) 0.0000 ( 32.1%) 0.0000 ( 27.0%) CoroConditionalWrapper
0.0000 (100.0%) 0.0000 (100.0%) 0.0000 (100.0%) 0.0000 (100.0%) Total
===-------------------------------------------------------------------------===
Analysis execution timing report
===-------------------------------------------------------------------------===
Total Execution Time: 0.0000 seconds (0.0000 wall clock)
--System Time-- --User+System-- ---Wall Time--- --- Name ---
0.0000 ( 60.0%) 0.0000 ( 60.0%) 0.0000 ( 61.9%) TargetLibraryAnalysis
0.0000 ( 20.0%) 0.0000 ( 20.0%) 0.0000 ( 19.0%) InnerAnalysisManagerProxy<FunctionAnalysisManager, Module>
0.0000 ( 20.0%) 0.0000 ( 20.0%) 0.0000 ( 19.0%) ProfileSummaryAnalysis
0.0000 (100.0%) 0.0000 (100.0%) 0.0000 (100.0%) Total
References
- https://gist.github.com/kassane/7bdb782a1984d0c6581ae7b44e1fc0c2?permalink_comment_id=4919689#gistcomment-4919689
- espressif/llvm-project@bc0d57e
- FFT HW Acceleration esp-rs/rust#190
- Optimize floating point arithmetic elkel53930/micromouse-std-esp32#1
- FPU notes on zig kubo39/esp32-resource#1
- https://www.cadence.com/content/dam/cadence-www/global/en_US/documents/tools/ip/tensilica-ip/isa-summary.pdf
- https://dl.espressif.com/github_assets/espressif/xtensa-isa-doc/releases/download/latest/Xtensa.pdf
- No, the ESP32-S2 is not faster at floating point