Incorrect code generation for arm64 with i1 vectors #53246

Closed
@cesaref

I'm using LLVM 13.0.1 with the following IR:

; ModuleID = 'styx'
source_filename = "styx"
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-darwin20.3.0"

%Type = type { <3 x i1>}

define void @writeValue(%Type* %v)
{
  store %Type { <3 x i1> <i1 true, i1 false, i1 true>}, %Type* %v

  ret void
}

Running this through llc stores different data through the passed %Type* depending on the optimisation level. With -O0 we get something sensible:

cesare@Cesares-MacBook-Pro o % llc a.ll -O0
cesare@Cesares-MacBook-Pro o % cat a.s
	.section	__TEXT,__text,regular,pure_instructions
	.build_version macos, 11, 0
	.globl	_writeValue                     ; -- Begin function writeValue
	.p2align	2
_writeValue:                            ; @writeValue
	.cfi_startproc
; %bb.0:
	mov	w8, #5
	strb	w8, [x0]
	ret
	.cfi_endproc
                                        ; -- End function
.subsections_via_symbols
cesare@Cesares-MacBook-Pro o % 

This stores 5 (0b101), which is the correct result for <i1 true, i1 false, i1 true> as a packed bool vector.
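
To make the expected value concrete, here is a minimal sketch of the packing, assuming lane i of the vector maps to bit i of the stored byte. The helper name packBool3 is hypothetical and not part of LLVM. Note that for <true, false, true> the opposite lane order would also give 5, so this particular test value does not by itself confirm the bit order.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): packs three boolean lanes into the
// low three bits of one byte, assuming lane i maps to bit i.
constexpr std::uint8_t packBool3(bool lane0, bool lane1, bool lane2)
{
    return static_cast<std::uint8_t>((lane0 ? 1u : 0u)
                                   | (lane1 ? 2u : 0u)
                                   | (lane2 ? 4u : 0u));
}

// <i1 true, i1 false, i1 true> packs to 0b101, the byte stored at -O0.
static_assert(packBool3(true, false, true) == 5, "expected 0b101");
```

Under that assumption, the `mov w8, #5` / `strb w8, [x0]` pair at -O0 is simply the constant packed at compile time.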

However, building with -O2 produces:

cesare@Cesares-MacBook-Pro o % llc a.ll -O2                                                       
cesare@Cesares-MacBook-Pro o % cat a.s
	.section	__TEXT,__text,regular,pure_instructions
	.build_version macos, 11, 0
	.globl	_writeValue                     ; -- Begin function writeValue
	.p2align	2
_writeValue:                            ; @writeValue
	.cfi_startproc
; %bb.0:
	adrp	x8, __PromotedConst@PAGE
	ldrb	w8, [x8, __PromotedConst@PAGEOFF]
	and	w9, w8, #0x4
	ubfx	w10, w8, #1, #1
	and	w8, w8, #0x1
	bfi	w8, w10, #1, #1
	orr	w8, w8, w9
	strb	w8, [x0]
	ret
	.cfi_endproc
                                        ; -- End function
	.section	__TEXT,__const
	.p2align	3                               ; @_PromotedConst
__PromotedConst:
	.byte	1                               ; 0x1
	.byte	0                               ; 0x0
	.byte	1                               ; 0x1
	.space	1

.subsections_via_symbols
cesare@Cesares-MacBook-Pro o % 

This is more complicated and, when run, it stores 1 rather than 5.
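
The following sketch emulates the -O2 instruction sequence in plain C++ to show where the 1 comes from. It is my reading of the listing rather than anything taken from LLVM itself, and the function name emulateO2Sequence is hypothetical. The sequence only reassembles the low three bits of the byte it loads, and __PromotedConst is emitted with one byte per lane (1, 0, 1), so the loaded byte is 1 and not the packed value 5.

```cpp
#include <cstdint>

// Hypothetical emulation of the -O2 listing; each line mirrors one
// instruction. The argument stands for the byte read by the ldrb.
inline std::uint8_t emulateO2Sequence(std::uint8_t loadedByte)
{
    std::uint32_t w8 = loadedByte;               // ldrb w8, [x8, __PromotedConst@PAGEOFF]
    const std::uint32_t w9 = w8 & 0x4u;          // and  w9, w8, #0x4
    const std::uint32_t w10 = (w8 >> 1) & 0x1u;  // ubfx w10, w8, #1, #1
    w8 &= 0x1u;                                  // and  w8, w8, #0x1
    w8 = (w8 & ~0x2u) | (w10 << 1);              // bfi  w8, w10, #1, #1
    w8 |= w9;                                    // orr  w8, w8, w9
    return static_cast<std::uint8_t>(w8);        // strb w8, [x0]
}
```

Calling emulateO2Sequence(1), with the first byte of __PromotedConst as emitted, returns 1, matching the observed result. Calling emulateO2Sequence(5) returns 5, which suggests the instruction sequence would be fine if the constant had been emitted in packed form.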

Here's a simple test program compiling against the two versions of this output (saved to a.O0.s and a.O2.s) demonstrating the different output:

jenkins@server o % cat main.cpp
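#include <iostream>

extern "C" void writeValue (void *buffer);

int main()
{
    // writeValue stores a single byte, so zero the int first;
    // otherwise the upper bytes printed below are indeterminate.
    int v = 0;

    writeValue (&v);

    std::cout << v << std::endl;
}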
jenkins@server o % clang -c a.O0.s; clang++ -std=c++11 main.cpp a.O0.o
jenkins@server o % ./a.out                                            
5
jenkins@server o % clang -c a.O2.s; clang++ -std=c++11 main.cpp a.O2.o
jenkins@server o % ./a.out                                            
1
jenkins@server o % 

I believe this is an arm64 code gen bug, but please correct me if I'm doing something daft.

Cesare
