Suboptimal Codegen for Vector128.AsVector128(Vector2)

### Description

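While experimenting with `Vector2` and `Vector128`, I noticed that the code generated for `Vector128.AsVector128(Vector2)` is not optimal: it zeroes the upper 64 bits with two `vxorps`/`vinsertps` pairs, even though the preceding `vmovq` has already cleared them.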

C#
```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics;
public static class C
{
    public static Vector128<float> AsVector128(Vector2 value) => value.AsVector128();
}
```

Current Codegen by [SharpLab](https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKGIAYACY8gOgDkBXfGKASzFwBuGvSasAShwB2GXtxYBJGXym5+QkQGYxSJqQYBhGgG8aDc023NdANRhgM0cqQAcAHgBmAGwjYMAPgYAQVw7BydXAAowxyh9ADdsLw4YAEoGAF5AxOSYFhCYiJdI1OFqAF8gA)
```asm
; Core CLR 6.0.322.12309 on amd64

C.AsVector128(System.Numerics.Vector2)
    L0000: vzeroupper
    L0003: vmovq xmm0, rdx
    L0008: vxorps xmm1, xmm1, xmm1
    L000c: vinsertps xmm0, xmm0, xmm1, 0x20
    L0012: vxorps xmm1, xmm1, xmm1
    L0016: vinsertps xmm0, xmm0, xmm1, 0x30
    L001c: vmovupd [rcx], xmm0
    L0020: mov rax, rcx
    L0023: ret

```

### Expected Codegen

For a standalone (non-inlined) method that receives the value in `rdx`:
```asm
vzeroupper
vmovq xmm0, rdx  ;automatically clears all the upper bits in xmm0
vmovdqu [rcx], xmm0
mov rax, rcx
ret
```

For a standalone (non-inlined) method that receives a reference to the value in `rdx`:
```asm
vzeroupper
vmovsd xmm0, [rdx]  ;automatically clears all the upper bits in xmm0
vmovupd [rcx], xmm0
mov rax, rcx
ret
```

If it's inlined and `rdx` has the value:
```asm
vmovq xmm0, rdx  ;automatically clears all the upper bits in xmm0
```

If it's inlined and `xmm1` has the value:
```asm
vmovddup xmm0, xmm1  ;if later calculations don't care about the upper 64 bits
```
Or, if the upper 64 bits must be cleared:
```asm
vxorps xmm0, xmm0, xmm0  ;clear xmm0 by hand
vmovsd xmm0, xmm0, xmm1  ;merge the lower 64 bits of xmm1
```

If it's inlined and `rsi` holds a reference to the value:
```asm
vmovsd xmm0, [rsi]  ;automatically clears all the upper bits in xmm0
```

### Configuration

[SharpLab](https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKGIAYACY8gOgDkBXfGKASzFwBuGvSasAShwB2GXtxYBJGXym5+QkQGYxSJqQYBhGgG8aDc023NdANRhgM0cqQAcAHgBmAGwjYMAPgYAQVw7BydXAAowxyh9ADdsLw4YAEoGAF5AxOSYFhCYiJdI1OFqAF8gA) (2022/05/13)

### Regression?

No

### Data

### Analysis

[This code in Vector128.cs](https://github.com/dotnet/runtime/blob/ea004343cd3ac6a0fbd01bc38fa2be995457f8b2/src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector128.cs#L248) may be the cause.
```csharp
        public static Vector128<float> AsVector128(this Vector2 value)
            => new Vector4(value, 0.0f, 0.0f).AsVector128();
```
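This implementation is inefficient because `new Vector4(Vector2, float, float)` emits two `vinsertps` instructions, one for each trailing component. When both components are zero, these inserts are unnecessary, since the `vmovq` that loads the `Vector2` already zeroes the upper 64 bits of the register.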
[SharpLab](https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKGIAYACY8gOgDkBXfGKASzFwBuGvSasAShwB2GXtxYBhCPgAOvADY8AyjwBu/GEJGNmLSTLkwWASRl8pufkdoBmMUiakGCmgG8aDIFMbsweAGowYBjQKN7RUAAUEVHQXrrY6hwwaAwAZuoQ2BgMAF45+YXFAO4AlAwAvAB8DFIwVQnpmdmlObXC1AC+QA===)
```csharp
using System;
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;
public static class C
{
    public static Vector4 Ctor(Vector2 value, float z, float w) => new(value, z, w);
}
```
```asm
C.Ctor(System.Numerics.Vector2, Single, Single)
    L0000: vzeroupper
    L0003: vmovq xmm0, rdx
    L0008: vinsertps xmm0, xmm0, xmm2, 0x20
    L000e: vinsertps xmm0, xmm0, xmm3, 0x30
    L0014: vmovupd [rcx], xmm0
    L0018: mov rax, rcx
    L001b: ret
```
The same is true for `Vector128.AsVector128(Vector3)`.  
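
As a rough illustration of the intended semantics, the conversion could be written without going through `Vector4` at all. The sketch below is only an assumption about one possible shape, not the runtime's implementation: the class name `Vector2Sketch` and the method name `AsVector128ZeroUpper` are hypothetical, and I have not verified what the JIT emits for it. It reinterprets the 8 bytes of the `Vector2` as a `ulong` and relies on `Vector128.CreateScalar`, which is documented to zero the remaining elements.

```csharp
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;

public static class Vector2Sketch
{
    // Hypothetical helper, not part of the BCL.
    public static Vector128<float> AsVector128ZeroUpper(this Vector2 value)
    {
        // Read X and Y (8 bytes total) as a single 64-bit integer.
        ulong bits = Unsafe.As<Vector2, ulong>(ref value);

        // CreateScalar places the value in element 0 and zeroes the rest,
        // so the upper 64 bits of the result are zero.
        return Vector128.CreateScalar(bits).AsSingle();
    }
}
```

For example, `new Vector2(1f, 2f).AsVector128ZeroUpper()` should compare equal to `Vector128.Create(1f, 2f, 0f, 0f)`.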

category:cq
theme:vector-codegen
skill-level:intermediate
cost:small
impact:small
