Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

float32 otherlib #2492

Merged
merged 180 commits into from
May 9, 2024
Merged
Show file tree
Hide file tree
Changes from 172 commits
Commits
Show all changes
180 commits
Select commit Hold shift + click to select a range
430b6b7
float32 in flambda2
TheNumbat Mar 14, 2024
627203b
fixes
TheNumbat Mar 14, 2024
296e7b8
arm
TheNumbat Mar 14, 2024
a0a2282
more crs
TheNumbat Mar 14, 2024
7458491
primitives
TheNumbat Mar 14, 2024
8644e75
format
TheNumbat Mar 14, 2024
d0342f1
upstream build
TheNumbat Mar 14, 2024
797a7a6
delete array case
TheNumbat Mar 19, 2024
fad1816
casts
TheNumbat Mar 19, 2024
0bd3e82
format
TheNumbat Mar 19, 2024
f195d02
Merge branch 'f32-fl2' into f32-backend
TheNumbat Mar 19, 2024
9c8d442
testsuite cmm parser
TheNumbat Mar 19, 2024
a2af96d
format
TheNumbat Mar 19, 2024
9451a8f
cast fl2
TheNumbat Mar 19, 2024
84e23ff
num conv
TheNumbat Mar 19, 2024
39b36ca
merge
TheNumbat Mar 19, 2024
e42799a
fix name in cfg
TheNumbat Mar 19, 2024
0b86abd
float32 static consts
TheNumbat Mar 19, 2024
2fe6ea4
add test
TheNumbat Mar 19, 2024
7e93f52
fixes
TheNumbat Mar 19, 2024
8d56965
Merge branch 'f32-fl2' into f32-backend
TheNumbat Mar 19, 2024
8ade880
another fix
TheNumbat Mar 19, 2024
7f64da8
Merge branch 'f32-fl2' into f32-backend
TheNumbat Mar 19, 2024
0d24bdd
fix nan test
TheNumbat Mar 19, 2024
0937fc4
actually use cconst_float32
TheNumbat Mar 20, 2024
e7f7b9e
float32 operations
TheNumbat Mar 21, 2024
28c4360
cmm stubs
TheNumbat Mar 21, 2024
ed13892
fix asserts
TheNumbat Mar 21, 2024
2ea3d51
separate storage single and actual float32
TheNumbat Mar 22, 2024
4375a5d
merge
TheNumbat Mar 22, 2024
1da7d02
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Mar 22, 2024
b6662e4
fix cmm test parser again
TheNumbat Mar 22, 2024
3e436c4
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Mar 22, 2024
5b81480
fix operation unboxing
TheNumbat Mar 22, 2024
170a192
oops
TheNumbat Mar 22, 2024
d83be31
comments
TheNumbat Mar 22, 2024
ca93bd7
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Mar 22, 2024
d8663f1
parsecmm again
TheNumbat Mar 22, 2024
e8e169c
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Mar 22, 2024
96403b6
Merge branch 'main' into f32-fl2
TheNumbat Mar 22, 2024
4068688
merge
TheNumbat Mar 22, 2024
15944f5
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Mar 22, 2024
4ff6ad9
fix
TheNumbat Mar 22, 2024
2d2498d
parsecmm again
TheNumbat Mar 22, 2024
52f66d8
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Mar 22, 2024
5ea8e3e
arm
TheNumbat Mar 22, 2024
0ef407b
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Mar 22, 2024
7fbbae6
clean up commit history
TheNumbat Mar 25, 2024
6663075
normalize zero/nan in hash
TheNumbat Mar 26, 2024
3785d6a
Apply suggestions from code review
TheNumbat Mar 26, 2024
62ad563
Update backend/x86_binary_emitter.ml
TheNumbat Apr 1, 2024
3387841
address comments
TheNumbat Apr 2, 2024
134c753
restore copyright headers
TheNumbat Apr 2, 2024
9fa47cd
unify single mem chunks
TheNumbat Apr 2, 2024
d9347fe
fix size_expr f32 const
TheNumbat Apr 2, 2024
a8d5321
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 2, 2024
50aa877
merge
TheNumbat Apr 2, 2024
ccfbe36
merge fix from later pr
TheNumbat Apr 2, 2024
b0d5f27
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 2, 2024
a8f23ed
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 2, 2024
c7437ba
fix comment
TheNumbat Apr 2, 2024
ae04a8e
parsecmm...
TheNumbat Apr 2, 2024
7ea280e
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 2, 2024
d218a6f
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 2, 2024
01c848f
arm build
TheNumbat Apr 2, 2024
ac35a97
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 2, 2024
1f47ecd
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 2, 2024
b13bd15
new machtype component
TheNumbat Apr 2, 2024
300c75a
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 2, 2024
b6f6391
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 2, 2024
062b106
add ext arg test
TheNumbat Apr 2, 2024
1b19d02
fix ext args
TheNumbat Apr 2, 2024
f1daf13
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 2, 2024
bb186b2
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 2, 2024
7eaa6ec
Apply suggestions from code review
TheNumbat Apr 3, 2024
f5766a3
comments
TheNumbat Apr 3, 2024
a42f8f7
use 4b size in selectgen
TheNumbat Apr 3, 2024
380c715
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 3, 2024
9cb2f0f
merge
TheNumbat Apr 3, 2024
e38b256
real f32 ops in f32_by_bit_pattern
TheNumbat Apr 8, 2024
a7b32de
float32 parsing (untested)
TheNumbat Apr 8, 2024
050b46b
install stubs for testsuite
TheNumbat Apr 8, 2024
f899698
merge
TheNumbat Apr 8, 2024
d08462a
merge
TheNumbat Apr 8, 2024
b2ff074
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 8, 2024
5c150cd
merge arm changes
TheNumbat Apr 8, 2024
95bb9b6
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 8, 2024
7d45d1f
Update ocaml/middle_end/flambda/closure_conversion.ml
TheNumbat Apr 9, 2024
64b2663
address comments
TheNumbat Apr 9, 2024
d165586
Merge branch 'f32-fl2' of github.com:ocaml-flambda/flambda-backend in…
TheNumbat Apr 9, 2024
62d2c75
Merge branch 'f32-fl2' into f32-backend
TheNumbat Apr 9, 2024
9eec365
comment
TheNumbat Apr 9, 2024
0f9dae1
Merge branch 'f32-fl2' into f32-backend
TheNumbat Apr 9, 2024
bb618c0
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 9, 2024
66187bb
merge
TheNumbat Apr 9, 2024
cbd71a1
simplify locale defs
TheNumbat Apr 9, 2024
fa7870b
Merge branch 'f32-fl2' into f32-backend
TheNumbat Apr 9, 2024
ebb0d6a
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 9, 2024
5834a7c
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 9, 2024
03ad39a
fix f32 alloc size
TheNumbat Apr 9, 2024
c219e35
fmt
TheNumbat Apr 9, 2024
8e9bdf8
move f32 lib to external
TheNumbat Apr 11, 2024
7f69249
fix build system
TheNumbat Apr 11, 2024
9ac6060
format
TheNumbat Apr 11, 2024
eacd280
Merge branch 'f32-fl2' into f32-backend
TheNumbat Apr 11, 2024
3b9867a
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 11, 2024
983b270
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 11, 2024
b78a2e4
fix compare
TheNumbat Apr 16, 2024
f551594
Merge branch 'f32-fl2' into f32-backend
TheNumbat Apr 16, 2024
4daf82d
fix static const
TheNumbat Apr 16, 2024
203d218
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 16, 2024
2925238
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 16, 2024
df8abe2
add neq_float32
TheNumbat Apr 16, 2024
ae3d4a5
preserve width in simplify comparison
TheNumbat Apr 16, 2024
e2bac20
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 16, 2024
04bd989
typo
TheNumbat Apr 16, 2024
77e83d9
Merge branch 'f32-fl2' into f32-backend
TheNumbat Apr 16, 2024
a12ba29
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 16, 2024
ff8762c
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 16, 2024
2a3a4f2
typo
TheNumbat Apr 16, 2024
fc02a69
Merge branch 'f32-fl2' into f32-backend
TheNumbat Apr 16, 2024
0cfb535
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 16, 2024
191df17
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 16, 2024
df6688e
another tagging fix
TheNumbat Apr 16, 2024
c50b64f
Merge branch 'f32-fl2' into f32-backend
TheNumbat Apr 16, 2024
555c85b
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 16, 2024
dc60a19
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 16, 2024
8528bab
squash
TheNumbat Apr 16, 2024
612beb2
remove pattern match
TheNumbat Apr 17, 2024
9146703
update test
TheNumbat Apr 17, 2024
4bdc1de
adjustment
TheNumbat Apr 17, 2024
ea62c43
adjustment
TheNumbat Apr 17, 2024
089a0ea
address comment
TheNumbat Apr 19, 2024
cbe341b
whitespace
TheNumbat Apr 19, 2024
d482728
preserve width in negation of other side
TheNumbat Apr 24, 2024
994435e
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 24, 2024
4bbcab3
Merge branch 'f32-backend-ops' into f32-lit
TheNumbat Apr 24, 2024
b0ed404
merge
TheNumbat Apr 26, 2024
3db086b
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 26, 2024
4aaa542
merge
TheNumbat Apr 26, 2024
261bc62
Merge branch 'f32-backend-ops' into f32-lit
TheNumbat Apr 26, 2024
ac288c1
fix arm build
TheNumbat Apr 26, 2024
1792a07
Merge branch 'f32-backend' into f32-fl2-ops
TheNumbat Apr 26, 2024
76ff55f
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 26, 2024
14d6d16
Merge branch 'f32-backend-ops' into f32-lit
TheNumbat Apr 26, 2024
99b0fa8
Merge branch 'main' into f32-fl2-ops
TheNumbat Apr 26, 2024
d1908f5
caml_hash_exn
TheNumbat Apr 26, 2024
096c550
Merge branch 'f32-backend-ops' into f32-lit
TheNumbat Apr 26, 2024
205d415
fix test
TheNumbat Apr 26, 2024
6015401
Merge branch 'f32-backend-ops' into f32-lit
TheNumbat Apr 26, 2024
736b9b4
Refactor otherlibs build (#2477)
TheNumbat Apr 29, 2024
ef742ca
Merge branch 'main' into f32-fl2-ops
TheNumbat Apr 29, 2024
dc44058
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 29, 2024
81ed6dd
Merge branch 'f32-backend-ops' into f32-lit
TheNumbat Apr 29, 2024
28c0b53
Merge branch 'main' into f32-fl2-ops
TheNumbat Apr 30, 2024
e6292a6
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat Apr 30, 2024
9c97040
Merge branch 'f32-backend-ops' into f32-lit
TheNumbat Apr 30, 2024
d36692d
squash
TheNumbat Apr 30, 2024
437e322
add toplevel printing for float32
TheNumbat Apr 30, 2024
5f6c62c
fix deps
TheNumbat Apr 30, 2024
676fe93
delete f32 lit crs
TheNumbat Apr 30, 2024
0767acd
promote tests
TheNumbat Apr 30, 2024
bcab4d2
install floats stubs lib
TheNumbat May 1, 2024
3ce16c1
add s suffix to printer
TheNumbat May 1, 2024
97b165a
test
TheNumbat May 1, 2024
153d894
operator module
TheNumbat May 1, 2024
dff064a
test min_max
TheNumbat May 1, 2024
31be758
fix test
TheNumbat May 1, 2024
ff1dd31
fix float_arith kinds
TheNumbat May 1, 2024
f796c15
Merge branch 'f32-fl2-ops' into f32-backend-ops
TheNumbat May 1, 2024
ecf95a5
Merge branch 'f32-backend-ops' into f32-lit
TheNumbat May 1, 2024
ad615d5
Merge branch 'f32-lit' into f32-lib2
TheNumbat May 1, 2024
3524b17
address comments
TheNumbat May 6, 2024
664faaf
add more string tests
TheNumbat May 7, 2024
58c751d
squash for review
TheNumbat May 7, 2024
0ca2a02
functorize binary_float_comp
TheNumbat May 7, 2024
a0b462d
merge
TheNumbat May 7, 2024
d5a8ce4
Merge branch 'f32-backend-ops' into f32-lit
TheNumbat May 7, 2024
219d2dc
merge
TheNumbat May 7, 2024
7df579a
merge
TheNumbat May 9, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
57 changes: 37 additions & 20 deletions backend/amd64/arch.ml
Original file line number Diff line number Diff line change
Expand Up @@ -139,14 +139,18 @@ type prefetch_info = {

type bswap_bitwidth = Sixteen | Thirtytwo | Sixtyfour

type float_width = Cmm.float_width

type specific_operation =
Ilea of addressing_mode (* "lea" gives scaled adds *)
Ilea of addressing_mode (* "lea" gives scaled adds *)
| Istore_int of nativeint * addressing_mode * bool
(* Store an integer constant *)
| Ioffset_loc of int * addressing_mode (* Add a constant to a location *)
| Ifloatarithmem of float_operation * addressing_mode
(* Store an integer constant *)
| Ioffset_loc of int * addressing_mode
(* Add a constant to a location *)
| Ifloatarithmem of float_width * float_operation * addressing_mode
(* Float arith operation with memory *)
| Ifloatsqrtf of addressing_mode (* Float square root from memory *)
| Ifloatsqrtf of float_width * addressing_mode
(* Float square root from memory *)
| Ibswap of { bitwidth: bswap_bitwidth; } (* endianness conversion *)
| Isextend32 (* 32 to 64 bit conversion with sign
extension *)
Expand All @@ -166,7 +170,10 @@ type specific_operation =
}

and float_operation =
Ifloatadd | Ifloatsub | Ifloatmul | Ifloatdiv
| Ifloatadd
| Ifloatsub
| Ifloatmul
| Ifloatdiv

(* Sizes, endianness *)

Expand Down Expand Up @@ -244,16 +251,23 @@ let print_specific_operation printreg op ppf arg =
(if is_assign then "(assign)" else "(init)")
| Ioffset_loc(n, addr) ->
fprintf ppf "[%a] +:= %i" (print_addressing printreg addr) arg n
| Ifloatsqrtf addr ->
| Ifloatsqrtf (Float64, addr) ->
fprintf ppf "sqrtf float64[%a]"
(print_addressing printreg addr) [|arg.(0)|]
| Ifloatarithmem(op, addr) ->
let op_name = function
| Ifloatadd -> "+f"
| Ifloatsub -> "-f"
| Ifloatmul -> "*f"
| Ifloatdiv -> "/f" in
fprintf ppf "%a %s float64[%a]" printreg arg.(0) (op_name op)
| Ifloatsqrtf (Float32, addr) ->
fprintf ppf "sqrtf float32[%a]"
(print_addressing printreg addr) [|arg.(0)|]
| Ifloatarithmem(width, op, addr) ->
let op_name = match width, op with
| Float64, Ifloatadd -> "+f"
| Float64, Ifloatsub -> "-f"
| Float64, Ifloatmul -> "*f"
| Float64, Ifloatdiv -> "/f"
| Float32, Ifloatadd -> "+f32"
| Float32, Ifloatsub -> "-f32"
| Float32, Ifloatmul -> "*f32"
| Float32, Ifloatdiv -> "/f32" in
fprintf ppf "%a %s float64[%a]" printreg arg.(0) op_name
(print_addressing printreg addr)
(Array.sub arg 1 (Array.length arg - 1))
| Ibswap { bitwidth } ->
Expand Down Expand Up @@ -360,9 +374,9 @@ let equal_prefetch_temporal_locality_hint left right =

let equal_float_operation left right =
match left, right with
| Ifloatadd, Ifloatadd -> true
| Ifloatsub, Ifloatsub -> true
| Ifloatmul, Ifloatmul -> true
| Ifloatadd, Ifloatadd
| Ifloatsub, Ifloatsub
| Ifloatmul, Ifloatmul
| Ifloatdiv, Ifloatdiv -> true
| (Ifloatadd | Ifloatsub | Ifloatmul | Ifloatdiv), _ -> false

Expand All @@ -373,11 +387,14 @@ let equal_specific_operation left right =
Nativeint.equal x y && equal_addressing_mode x' y' && Bool.equal x'' y''
| Ioffset_loc (x, x'), Ioffset_loc (y, y') ->
Int.equal x y && equal_addressing_mode x' y'
| Ifloatarithmem (x, x'), Ifloatarithmem (y, y') ->
equal_float_operation x y && equal_addressing_mode x' y'
| Ifloatarithmem (xw, x, x'), Ifloatarithmem (yw, y, y') ->
Cmm.equal_float_width xw yw &&
equal_float_operation x y &&
equal_addressing_mode x' y'
| Ibswap { bitwidth = left }, Ibswap { bitwidth = right } ->
Int.equal (int_of_bswap_bitwidth left) (int_of_bswap_bitwidth right)
| Ifloatsqrtf left, Ifloatsqrtf right ->
| Ifloatsqrtf (left_w, left), Ifloatsqrtf (right_w, right) ->
Cmm.equal_float_width left_w right_w &&
equal_addressing_mode left right
| Isextend32, Isextend32 ->
true
Expand Down
12 changes: 9 additions & 3 deletions backend/amd64/arch.mli
Original file line number Diff line number Diff line change
Expand Up @@ -68,14 +68,17 @@ type prefetch_info = {

type bswap_bitwidth = Sixteen | Thirtytwo | Sixtyfour

type float_width = Cmm.float_width

type specific_operation =
Ilea of addressing_mode (* "lea" gives scaled adds *)
| Istore_int of nativeint * addressing_mode * bool
(* Store an integer constant *)
| Ioffset_loc of int * addressing_mode (* Add a constant to a location *)
| Ifloatarithmem of float_operation * addressing_mode
| Ifloatarithmem of float_width * float_operation * addressing_mode
(* Float arith operation with memory *)
| Ifloatsqrtf of addressing_mode (* Float square root from memory *)
| Ifloatsqrtf of float_width * addressing_mode
(* Float square root from memory *)
| Ibswap of { bitwidth: bswap_bitwidth; } (* endianness conversion *)
| Isextend32 (* 32 to 64 bit conversion with sign
extension *)
Expand All @@ -95,7 +98,10 @@ type specific_operation =
}

and float_operation =
Ifloatadd | Ifloatsub | Ifloatmul | Ifloatdiv
| Ifloatadd
| Ifloatsub
| Ifloatmul
| Ifloatdiv

val equal_specific_operation : specific_operation -> specific_operation -> bool

Expand Down
107 changes: 74 additions & 33 deletions backend/amd64/emit.mlp
Original file line number Diff line number Diff line change
Expand Up @@ -570,18 +570,28 @@ let instr_for_intop = function
| Iasr -> I.sar
| _ -> assert false

let instr_for_floatop = function
| Iaddf -> I.addsd
| Isubf -> I.subsd
| Imulf -> I.mulsd
| Idivf -> I.divsd
let instr_for_floatop width op =
match width, op with
| Float64, Iaddf -> I.addsd
| Float64, Isubf -> I.subsd
| Float64, Imulf -> I.mulsd
| Float64, Idivf -> I.divsd
| Float32, Iaddf -> I.addss
| Float32, Isubf -> I.subss
| Float32, Imulf -> I.mulss
| Float32, Idivf -> I.divss
| _ -> assert false

let instr_for_floatarithmem = function
| Ifloatadd -> I.addsd
| Ifloatsub -> I.subsd
| Ifloatmul -> I.mulsd
| Ifloatdiv -> I.divsd
let instr_for_floatarithmem width op =
match width, op with
| Float64, Ifloatadd -> I.addsd
| Float64, Ifloatsub -> I.subsd
| Float64, Ifloatmul -> I.mulsd
| Float64, Ifloatdiv -> I.divsd
| Float32, Ifloatadd -> I.addss
| Float32, Ifloatsub -> I.subss
| Float32, Ifloatmul -> I.mulss
| Float32, Ifloatdiv -> I.divss

let cond = function
| Isigned Ceq -> E | Isigned Cne -> NE
Expand All @@ -600,7 +610,8 @@ let output_test_zero arg =

(* Output a floating-point compare and branch *)

let emit_float_test cmp i ~(taken:X86_ast.condition -> unit) =
let emit_float_test (width : Cmm.float_width)
cmp i ~(taken:X86_ast.condition -> unit) =
(* Effect of comisd on flags and conditional branches:
ZF PF CF cond. branches taken
unordered 1 1 1 je, jb, jbe, jp
Expand All @@ -610,46 +621,51 @@ let emit_float_test cmp i ~(taken:X86_ast.condition -> unit) =
If FP traps are on (they are off by default),
comisd traps on QNaN and SNaN but ucomisd traps on SNaN only.
*)
let ucomi, comi =
match width with
| Float64 -> I.ucomisd, I.comisd
| Float32 -> I.ucomiss, I.comiss
in
match cmp with
| CFeq when arg i 1 = arg i 0 ->
I.ucomisd (arg i 1) (arg i 0);
ucomi (arg i 1) (arg i 0);
taken NP
| CFeq ->
let next = new_label() in
I.ucomisd (arg i 1) (arg i 0);
ucomi (arg i 1) (arg i 0);
I.jp (label next); (* skip if unordered *)
taken E; (* branch taken if x=y *)
def_label next
| CFneq when arg i 1 = arg i 0 ->
I.ucomisd (arg i 1) (arg i 0);
ucomi (arg i 1) (arg i 0);
taken P
| CFneq ->
I.ucomisd (arg i 1) (arg i 0);
ucomi (arg i 1) (arg i 0);
taken P; (* branch taken if unordered *)
taken NE (* branch taken if x<y or x>y *)
| CFlt ->
I.comisd (arg i 0) (arg i 1);
comi (arg i 0) (arg i 1);
taken A (* branch taken if y>x i.e. x<y *)
| CFnlt ->
I.comisd (arg i 0) (arg i 1);
comi (arg i 0) (arg i 1);
taken BE (* taken if unordered or y<=x i.e. !(x<y) *)
| CFle ->
I.comisd (arg i 0) (arg i 1);(* swap compare *)
comi (arg i 0) (arg i 1); (* swap compare *)
taken AE (* branch taken if y>=x i.e. x<=y *)
| CFnle ->
I.comisd (arg i 0) (arg i 1);(* swap compare *)
comi (arg i 0) (arg i 1); (* swap compare *)
taken B (* taken if unordered or y<x i.e. !(x<=y) *)
| CFgt ->
I.comisd (arg i 1) (arg i 0);
comi (arg i 1) (arg i 0);
taken A (* branch taken if x>y *)
| CFngt ->
I.comisd (arg i 1) (arg i 0);
comi (arg i 1) (arg i 0);
taken BE (* taken if unordered or x<=y i.e. !(x>y) *)
| CFge ->
I.comisd (arg i 1) (arg i 0);(* swap compare *)
comi (arg i 1) (arg i 0); (* swap compare *)
taken AE (* branch taken if x>=y *)
| CFnge ->
I.comisd (arg i 1) (arg i 0);(* swap compare *)
comi (arg i 1) (arg i 0); (* swap compare *)
taken B (* taken if unordered or x<y i.e. !(x>=y) *)

let emit_test i ~(taken:X86_ast.condition -> unit) = function
Expand All @@ -669,8 +685,8 @@ let emit_test i ~(taken:X86_ast.condition -> unit) = function
| Iinttest_imm(cmp, n) ->
I.cmp (int n) (arg i 0);
taken (cond cmp)
| Ifloattest cmp ->
emit_float_test cmp i ~taken
| Ifloattest (width, cmp) ->
emit_float_test width cmp i ~taken
| Ioddtest ->
I.test (int 1) (arg8 i 0);
taken NE
Expand Down Expand Up @@ -1513,18 +1529,31 @@ let emit_instr ~first ~fallthrough i =
instr_for_intop op (int n) (res i 0)
| Lop(Iintop_atomic{op; size; addr}) ->
emit_atomic i op size addr
| Lop(Ifloatop(Icompf cmp)) ->
| Lop(Ifloatop(Float64, Icompf cmp)) ->
let cond, need_swap = float_cond_and_need_swap cmp in
let a0, a1 = if need_swap then arg i 1, arg i 0 else arg i 0, arg i 1 in
I.cmpsd cond a1 a0;
I.movq a0 (res i 0);
I.neg (res i 0)
| Lop(Ifloatop(Inegf)) ->
| Lop(Ifloatop(Float32, Icompf cmp)) ->
let cond, need_swap = float_cond_and_need_swap cmp in
let a0, a1 = if need_swap then arg i 1, arg i 0 else arg i 0, arg i 1 in
I.cmpss cond a1 a0;
I.movd a0 (res32 i 0);
(* CMPSS only sets the bottom 32 bits of the result, so we sign-extend to
copy the result to the top 32 bits. *)
I.movsxd (res32 i 0) (res i 0);
I.neg (res i 0)
| Lop(Ifloatop(Float64, Inegf)) ->
I.xorpd (mem64_rip VEC128 (emit_symbol "caml_negf_mask")) (res i 0)
| Lop(Ifloatop(Iabsf)) ->
| Lop(Ifloatop(Float64, Iabsf)) ->
I.andpd (mem64_rip VEC128 (emit_symbol "caml_absf_mask")) (res i 0)
| Lop(Ifloatop(Iaddf | Isubf | Imulf | Idivf as floatop)) ->
instr_for_floatop floatop (arg i 1) (res i 0)
| Lop(Ifloatop(Float32, Inegf)) ->
I.xorps (mem64_rip VEC128 (emit_symbol "caml_negf32_mask")) (res i 0)
| Lop(Ifloatop(Float32, Iabsf)) ->
I.andps (mem64_rip VEC128 (emit_symbol "caml_absf32_mask")) (res i 0)
| Lop(Ifloatop(width, (Iaddf | Isubf | Imulf | Idivf as floatop))) ->
instr_for_floatop width floatop (arg i 1) (res i 0)
| Lop(Iintofvalue | Ivalueofint | Ivectorcast Bits128) ->
move i.arg.(0) i.res.(0)
| Lop(Iscalarcast (Float_of_int Float64)) ->
Expand Down Expand Up @@ -1579,18 +1608,23 @@ let emit_instr ~first ~fallthrough i =
I.mov (nat n) (addressing addr QWORD i 0)
| Lop(Ispecific(Ioffset_loc(n, addr))) ->
I.add (int n) (addressing addr QWORD i 0)
| Lop(Ispecific(Ifloatarithmem(op, addr))) ->
instr_for_floatarithmem op (addressing addr REAL8 i 1) (res i 0)
| Lop(Ispecific(Ifloatarithmem(Float64, op, addr))) ->
instr_for_floatarithmem Float64 op (addressing addr REAL8 i 1) (res i 0)
| Lop(Ispecific(Ifloatarithmem(Float32, op, addr))) ->
instr_for_floatarithmem Float32 op (addressing addr REAL4 i 1) (res i 0)
| Lop(Ispecific(Ibswap { bitwidth = Sixteen })) ->
I.xchg ah al;
I.movzx (res16 i 0) (res i 0)
| Lop(Ispecific(Ibswap { bitwidth = Thirtytwo })) ->
I.bswap (res32 i 0);
| Lop(Ispecific(Ibswap { bitwidth = Sixtyfour })) ->
I.bswap (res i 0)
| Lop(Ispecific(Ifloatsqrtf addr)) ->
| Lop(Ispecific(Ifloatsqrtf (Float64, addr))) ->
I.xorpd (res i 0) (res i 0); (* avoid partial register stall *)
I.sqrtsd (addressing addr REAL8 i 0) (res i 0)
| Lop(Ispecific(Ifloatsqrtf (Float32, _addr))) ->
(* CR mslater: (float32) Ifloatsqrtf Float32 *)
Misc.fatal_error "Ifloatsqrtf Float32 should never be generated."
| Lop(Ispecific(Isextend32)) ->
I.movsxd (arg32 i 0) (res i 0)
| Lop(Ispecific(Izextend32)) ->
Expand Down Expand Up @@ -2012,6 +2046,13 @@ let begin_assembly unix =
_label (emit_symbol "caml_absf_mask");
D.qword (Const 0x7FFFFFFFFFFFFFFFL);
D.qword (Const 0xFFFFFFFFFFFFFFFFL);
_label (emit_symbol "caml_negf32_mask");
D.qword (Const 0x80000000L);
D.qword (Const 0L);
D.align ~data:true 16;
_label (emit_symbol "caml_absf32_mask");
D.qword (Const 0xFFFFFFFF7FFFFFFFL);
D.qword (Const 0xFFFFFFFFFFFFFFFFL);
end;

D.data ();
Expand Down
13 changes: 8 additions & 5 deletions backend/amd64/proc.ml
Original file line number Diff line number Diff line change
Expand Up @@ -488,7 +488,7 @@ let destroyed_at_oper = function
| Iop(Ispecific(Isextend32 | Izextend32 | Ilea _
| Istore_int (_, _, _) | Ioffset_loc (_, _)
| Ipause | Iprefetch _
| Ifloatarithmem (_, _) | Ifloatsqrtf _ | Ibswap _))
| Ifloatarithmem (_, _, _) | Ifloatsqrtf (_, _) | Ibswap _))
| Iop(Iintop(Iadd | Isub | Imul | Iand | Ior | Ixor | Ilsl | Ilsr | Iasr
| Ipopcnt | Iclz _ | Ictz _ ))
| Iop(Iintop_imm((Iadd | Isub | Imul | Imulh _ | Iand | Ior | Ixor | Ilsl
Expand Down Expand Up @@ -662,7 +662,8 @@ let max_register_pressure =
consumes ~int:(1 + num_destroyed_by_plt_stub) ~float:0
| Iintop(Icomp _) | Iintop_imm((Icomp _), _) ->
consumes ~int:1 ~float:0
| Istore(Single { reg = Float64 }, _, _) | Ifloatop (Icompf _) ->
| Istore(Single { reg = Float64 }, _, _)
| Ifloatop ((Float64 | Float32), Icompf _) ->
consumes ~int:0 ~float:1
| Ispecific(Isimd op) ->
(match Simd_proc.register_behavior op with
Expand All @@ -688,7 +689,8 @@ let max_register_pressure =
| Single { reg = Float32 } | Double
| Onetwentyeight_aligned | Onetwentyeight_unaligned),
_, _)
| Imove | Ispill | Ireload | Ifloatop (Inegf | Iabsf | Iaddf | Isubf | Imulf | Idivf)
| Imove | Ispill | Ireload
| Ifloatop ((Float64 | Float32), (Inegf | Iabsf | Iaddf | Isubf | Imulf | Idivf))
| Icsel _
| Ivalueofint | Iintofvalue | Ivectorcast _ | Iscalarcast _
| Iconst_int _ | Iconst_float _ | Iconst_float32 _
Expand All @@ -698,7 +700,8 @@ let max_register_pressure =
| Ispecific(Ilea _ | Isextend32 | Izextend32 | Iprefetch _ | Ipause
| Irdtsc | Irdpmc | Istore_int (_, _, _)
| Ilfence | Isfence | Imfence
| Ioffset_loc (_, _) | Ifloatarithmem (_, _) | Ifloatsqrtf _
| Ioffset_loc (_, _) | Ifloatarithmem (_, _, _)
| Ifloatsqrtf (_, _)
| Ibswap _)
| Iname_for_debugger _ | Iprobe _ | Iprobe_is_enabled _ | Iopaque
| Ibeginregion | Iendregion | Idls_get
Expand Down Expand Up @@ -794,7 +797,7 @@ let operation_supported = function
| Cbswap _
| Cclz _ | Cctz _
| Ccmpi _ | Caddv | Cadda | Ccmpa _
| Cnegf | Cabsf | Caddf | Csubf | Cmulf | Cdivf
| Cnegf _ | Cabsf _ | Caddf _ | Csubf _ | Cmulf _ | Cdivf _
| Cvalueofint | Cintofvalue
| Ccmpf _
| Craise _
Expand Down
Loading
Loading