Sync with upstream main branch #32

Closed
wants to merge 98 commits into from

Conversation

joshpeterson

This is an automatically generated pull request to merge changes from the upstream main branch.

MichalStrehovsky and others added 30 commits January 1, 2022 10:33
The test got renamed in dotnet#63178.

Should fix the Mono AOT CI failures seen in dotnet#63232.
The real build now happens in runtime/CMakeLists.txt, the Makefile
contains only helper targets now.
When `DiagnosticName` was introduced into the type system, I didn't want to deal with it and compiled it out of the NativeAOT version of the type system.

In order to have a single ILCompiler.TypeSystem assembly that can be used with both crossgen2 and ILC, this needs to be implemented.

I've also reduced the number of diffs between ILCompiler.TypeSystem.csproj and ILCompiler.TypeSystem.ReadyToRun.csproj.
Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
…63065)

* [mono][wasm] Allow methods with finally clauses to be AOTed.

This is implemented by running the finally clause with the interpreter.
Methods with clauses have additional code generated, which:
* Saves the IL state (pc+arguments+locals) into a MonoMethodILState
structure.
* Pushes an LMF frame on the LMF stack of type MONO_LMFEXT_IL_STATE.
  The LMF frame points to the il state.

During EH, if such an LMF frame is found, and the IL pc in the
il state points inside a clause, then an interpreted version
of the method is created, and the finally clause is run using
the interpreter using the il state as the starting state.

* Disable a few test suites which now cause emscripten to OOM when building with AOT.
…#63280)

IL generation (stubs/thunks) is not part of the core type system and these files are not included in ILCompiler.TypeSystem.ReadyToRun. Somehow we accumulated them in ILCompiler.TypeSystem but they can be pretty cleanly moved to ILCompiler.Compiler (left one TODO for a subsequent cleanup since some of what's in Common\TypeSystem should actually be in ILCompiler.Compiler proper).
* Description of DebuggerBrowsable behavior.

* Added test for browse attributes.

* Corrected typos in the doc.

* Added Browse Never feature. Corrected Collapse test. ToDo: RootHidden.

* Draft of RootHidden solution.

* Added Array to test cases as it behaves differently than Collection.

* Added name concatenation to make array/list elements in debug window unique.

* Update docs/design/mono/debugger.md

Co-authored-by: Ankit Jain <radical@gmail.com>

* Applied PR review suggestions.

* Added a reference to regular Browsable attribute behavior in .net.

* Applied most of review suggestions.

* Stopping GetFieldsValue early.

* Remove unintentional change to the original code.

* Do not skip fields that don't have browsable attributes.

* Changing the expected behavior to match Console Application. EventHandlers are Browsable.Never by default.

* Changed the place of checking if object is an array.

* Update src/mono/wasm/debugger/DebuggerTestSuite/EvaluateOnCallFrameTests.cs

Co-authored-by: Ankit Jain <radical@gmail.com>

* Removed unused variables.

* Removing space and unused import.

* Partially addressed @radical comments.

* Addressed the comment about extension instead of Union.

* Removed string cultural vulnerability.

* Added Properties dictionary, the same as for fields.

* Fixed the bug I made by using dynamic.

* Applying @radical comments about refactoring.

* Corrected typo.

* Added tests for properties.

* Draft of changes for properties handling - never and root hidden failing.

* Fix for RootHidden properties.

* Added tests for static fields decorated with Browsable.

* Correct a typo.

* Undo merge unintentional changes.

* Changing expected behavior for MulticastDelegateTest - in Console Application EventHandler is Browsable.Never by default so we should not expect it to be visible in the debug window.

* Removing not relevant changes created after merge with main.

* Remove file added in merge with main.

* Revert "Removing not relevant changes created after merge with main."

This reverts commit b1acf8b.

* Revert.

* Revert revert.

* One broken test for custom getter.

* Ugly fix to make all the tests work.

* Refactored JArray aggregation to Dictionary.

* Better naming.

* Remove not connected to PR file.

* Applied @thaystg suggestions.

* Removed comments.

Co-authored-by: Ankit Jain <radical@gmail.com>
…otnet#63281)

After this and dotnet#63280 there will be no differences between ILCompiler.TypeSystem and ILCompiler.TypeSystem.ReadyToRun and we can unify them.
It can have so many locals that zero-initing is measurable.
Rename it to parent_ and add m_field_get_parent / m_field_set_parent accessors.

(The intention is to borrow the bottom bit of the pointer for an EnC metadata
update flag)
They don't have to differ in the `--parallelism` vs `--singlethreaded` argument.
…od_union_preclean (dotnet#63293)

Fixes mono/mono#21369
Related to dotnet/android#6546

job_major_mod_union_preclean can race with the tarjan bridge
implementation that changes the vtable pointer by setting the three
lower bits. This results in an invalid load of the vtable
(shifted by 7 bytes), which in turn gives a wrong desc to the scan
functions.
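
The effect of reading a tagged pointer without masking can be sketched in a few lines of Python. This is only an illustration, not the Mono code: the address, `TAG_MASK`, and the helper names are hypothetical, and 8-byte alignment is assumed because three low bits are available for tagging.

```python
# Hypothetical illustration of low-bit pointer tagging; not Mono's implementation.
TAG_MASK = 0b111  # three lower bits, assumed free because the pointer is 8-byte aligned


def tag(ptr: int, bits: int = TAG_MASK) -> int:
    """Set tag bits in the low bits of an aligned pointer."""
    if ptr & TAG_MASK:
        raise ValueError("pointer must be 8-byte aligned")
    return ptr | bits


def untag(word: int) -> int:
    """Recover the real pointer by clearing the tag bits."""
    return word & ~TAG_MASK


vtable = 0x7F00_1000          # made-up aligned vtable address
word = tag(vtable)            # value stored while all three tag bits are set
naive = word                  # a racing reader that does not mask the tag bits

print(naive - vtable)         # 7: the bogus address is off by 7 bytes
print(hex(untag(word)))       # masking recovers the real vtable address
```

A reader that races with the tagging code and skips the `untag` step ends up 7 bytes past the real vtable, which is the invalid load described above.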

This change is released under the MIT license.

Co-authored-by: tmijieux <tmijieux@users.noreply.github.com>
…ations (dotnet#61185)

* Added logic for default interface method traversal to ILVerify method discovery

* Added Tests for DefaultImplFix

* Moved call to default interface impl resolution outside of ResolveInterfaceMethodTarget
* Update COM host to match RegAsm registration behavior
Replaced with a single ILCompiler.TypeSystem shared between crossgen2 and ilc.
…net#62973)

* make sure OpenSSL is initialized before Tls13Supported code runs

* feedback from review

* Update src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.Ssl.cs

Co-authored-by: Stephen Toub <stoub@microsoft.com>

Co-authored-by: Stephen Toub <stoub@microsoft.com>
* fix TryGetAddrInfo_HostName_TryGetNameInfo()

* Fix network test
…t#62958)

* Extend CPU capabilities detection for osx-arm64 (dotnet#62832)

* Revert unconditional enable for dczva on osx-arm64
* Do addition for EndZ matching at compile time

* Tweak rendering of optional loops to say "Optional" rather than "Loop optionally"

* Remove "at least X" from loop description when X is 0

* Add a missing blank line at the beginning of a back reference

* Rename ReturnFalse to NoStartingPositionFound

* Delete stale comments

* Address PR feedback
elinor-fung and others added 22 commits January 6, 2022 17:10
…dleOnStack/QCallTypeHandle (dotnet#62141)

* [mono] Make some icalls return objects using an extra ObjectHandleOnStack argument.

* Fix the passing of scalar vtypes on wasm.

* Convert RuntimeTypeHandle icalls to receive a QCallTypeHandle.

Also do some other cleanups:
* Convert some icalls which don't receive/return objects any more and don't
  return an error to NOHANDLES.
* Implement IsGenericVariable in managed code.
* Add a separate GetMetadataToken icall to avoid the REUSE_WRAPPER stuff.
* Sync the implementation of IsTypeDefinition with coreclr.

* Convert some RuntimeType icalls to use QCallTypeHandles/ObjectHandleOnStack.

* Fix RuntimeType.GetPacking () to work with dynamic types.

* Convert internal_from_name to use QCallTypeHandles/ObjectHandleOnStack.

* Convert more icalls.

* Update src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/QCallHandles.cs

Co-authored-by: Aleksey Kliger (λgeek) <akliger@gmail.com>

* Remove unused argument from RuntimeTypeHandle:internal_from_name ().

* Add support for FRAME_TYPE_IL_STATE to the metadata stack walker code.

* Update coding style in QCallTypeHandle:.ctor ().

* Revert "Convert more icalls."

* Convert some Delegate icalls.

* Convert some Enum icalls.

* Convert some Marshal icalls.

* Avoid creating RuntimeType objects while AOTing.

* Convert some RuntimeType icalls.

Co-authored-by: Aleksey Kliger (λgeek) <akliger@gmail.com>
If we have a non-inline candidate call with generics context, save
the context so it's available for late devirtualization.

Fixes a missing devirtualization reported in dotnet#63283.

I am deliberately leaving `LateDevirtualizationInfo` more general than
necessary as it may serve as a jumping-off point for enabling late
inlining.
…t#63431)

* Use "read" to fetch misaligned 64bit values

* remove read_direct

* rename read() -> read_byte()
Co-authored-by: qiaopengcheng <qiaopengcheng-hf@loongson.cn>
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Co-authored-by: kasperk81 <83082615+kasperk81@users.noreply.github.com>
* Clarify P/Invoke shims guidance for OOB assemblies

The discussion in PR dotnet#63421 clarified that System.Native shims for UNIX
APIs aren't appropriate for assemblies that don't ship as part of the
Microsoft.NETCore.App framework.

Updating the interop guidelines to capture that clarification.

* Update docs/coding-guidelines/interop-guidelines.md

Co-authored-by: Stephen Toub <stoub@microsoft.com>

Co-authored-by: Stephen Toub <stoub@microsoft.com>
`init-vs-env.cmd` is going to set up the environment for x86 tools (the "VS Developer Command Prompt"), and that messes with things if they run before we run vcvarsall to set the actual target architecture, so it's better not to have such a window. We might not even run the vcvarsall line if the native test build is skipped.

As an additional improvement, init-vs-env.cmd can also run vcvarsall and set up CMake, so I'm asking it to do it (done by passing an extra parameter to the script to specify which environment to activate).

I need this change for a subsequent change that is broken if we have x86 tools set up.
(1) Reduce contention on the pool lock. This is done mainly by (a) not calling TrySetResult on the queued waiter under the lock -- instead, do this outside the lock and retry as necessary for canceled requests; (b) avoid doing diagnostic logging under the lock.
(2) Improve handling of failed connection attempts so we don't fail requests unnecessarily.
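
The idea in (1) can be sketched as follows. This is a rough Python analogue, not the HttpClient connection pool code: `ConnectionPool` and its methods are hypothetical names, and `concurrent.futures.Future` stands in for the queued waiter, with `set_running_or_notify_cancel()` playing the role that `TrySetResult` plays in the original.

```python
import threading
from collections import deque
from concurrent.futures import Future


class ConnectionPool:
    """Hypothetical pool that completes waiters outside its lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiters = deque()   # queued requests waiting for a connection
        self._idle = []           # connections nobody is waiting for

    def wait_for_connection(self) -> Future:
        waiter = Future()
        with self._lock:
            self._waiters.append(waiter)
        return waiter

    def return_connection(self, conn) -> None:
        while True:
            # Hold the lock only long enough to dequeue one waiter.
            with self._lock:
                if not self._waiters:
                    self._idle.append(conn)
                    return
                waiter = self._waiters.popleft()
            # Outside the lock: try to hand the connection to the waiter.
            if waiter.set_running_or_notify_cancel():
                waiter.set_result(conn)
                return
            # The waiter was canceled in the meantime; retry with the next one.
```

Completing the waiter outside the lock keeps any continuation work from extending the critical section; the cost is the retry loop for waiters that were canceled between dequeue and completion.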
* Fix NETStandard library using JSON source gen

NETStandard libraries using JSON source gen would fail to load on
.NETCore due to missing IsExternalInit in the assembly.

On .NETCore this is defined, whereas on .NETStandard JSON carries an
internal copy.  The compiler emits references to the internal type when
a NETStandard library calls init-only setters in JSON types, but these
references are not satisfied when the library runs on .NETCore.

Fix this by adding a type forward to JSON for the IsExternalInit type.

* Address feedback

* Update src/libraries/System.Text.Json/tests/System.Text.Json.SourceGeneration.Tests/NETStandardContextTests.cs

Co-authored-by: Eric Erhardt <eric.erhardt@microsoft.com>

Co-authored-by: Eric Erhardt <eric.erhardt@microsoft.com>
This is a stopgap measure to make merged test logs easier to read.
Jeremy considers a broader cleanup of the test wrapper generator
that may ultimately supersede or replace this change.

Thanks

Tomas
…net#63482)

* Fix exception propagation over HW exception frame on macOS arm64

There is a bug in setting up the fake stack frame for
the PAL_DispatchExceptionWrapper. The FP and SP were not saved
to the stack frame and the FP of the context was not set to
match the fake prologue. That caused failure to unwind over the
PAL_DispatchExceptionWrapper, reaching an unrelated native
function.

This change fixes it.

* Add regression tests for the issue

* Real fix of the issue

Unifies the hardware exception frame unwinding with Linux,
it is necessary on arm64 to get correct and distinct LR and
PC in the frame of the hardware exception.
* Fix translation of Ping error codes

* Modify SendPingToExternalHostWithLowTtlTest

* Usage of TtlReassemblyTimeExceeded

* Revert "Usage of TtlReassemblyTimeExceeded"

This reverts commit 9194df0.

* Eliminate branch and fall-back to default

* Style change: Usage of switch expressions

* Style change : Usage of switch expressions

* Revert "Modify SendPingToExternalHostWithLowTtlTest"

This reverts commit fd76e9d.
@joshpeterson

It looks like dotnet#63482 introduced an issue. It is being reverted upstream, so we will skip this sync up and try again next weekend.

@joshpeterson joshpeterson deleted the bot-upstream-main-merge-2022-01-08 branch January 11, 2022 01:17
yamato-ci-bot pushed a commit that referenced this pull request Jan 15, 2022
…otnet#63598)

* Fix native frame unwind in syscall on arm64 for VS4Mac crash report.

Add arm64 version of StepWithCompactNoEncoding for syscall leaf node wrappers that have compact encoding of 0.

Fix ReadCompactEncodingRegister so it actually decrements the addr.

Change StepWithCompactEncodingArm64 to match what macOS libunwind does for framed and frameless stepping.

arm64 can have frames with the same SP (but different IPs). Increment SP for this condition so createdump's unwind
loop doesn't break out on the "SP not increasing" check and the frames are added to the thread frame list in the
correct order.

Add getting the unwind info for tail called functions like this:

__ZL14PROCEndProcessPvji:
   36630:       f6 57 bd a9     stp     x22, x21, [sp, #-48]!
   36634:       f4 4f 01 a9     stp     x20, x19, [sp, #16]
   36638:       fd 7b 02 a9     stp     x29, x30, [sp, #32]
   3663c:       fd 83 00 91     add     x29, sp, #32
...
   367ac:       e9 01 80 52     mov     w9, #15
   367b0:       7f 3e 02 71     cmp     w19, #143
   367b4:       20 01 88 1a     csel    w0, w9, w8, eq
   367b8:       2e 00 00 94     bl      _PROCAbort
_TerminateProcess:
-> 367bc:       22 00 80 52     mov     w2, #1
   367c0:       9c ff ff 17     b       __ZL14PROCEndProcessPvji

The IP (367bc) returns the (incorrect) frameless encoding with nothing on the stack (uses an incorrect LR to unwind). To fix this,
get the unwind info for PC - 1, which points to PROCEndProcess with the correct unwind info. This matches how lldb unwinds this frame.
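
Why PC - 1 lands in the right function can be shown with a small address lookup. This is a sketch in Python, not the createdump code: the two start addresses come from the disassembly above, while `function_containing` and the table layout are hypothetical.

```python
import bisect

# Function start addresses taken from the disassembly above.
SYMBOLS = [
    (0x36630, "__ZL14PROCEndProcessPvji"),
    (0x367BC, "_TerminateProcess"),
]
STARTS = [addr for addr, _ in SYMBOLS]


def function_containing(addr: int) -> str:
    """Return the function whose start address is the greatest one <= addr."""
    index = bisect.bisect_right(STARTS, addr) - 1
    if index < 0:
        raise LookupError(hex(addr))
    return SYMBOLS[index][1]


# `bl _PROCAbort` at 0x367b8 is the last instruction of PROCEndProcess,
# so its return address is the first instruction of the next function.
return_address = 0x367BC
print(function_containing(return_address))      # _TerminateProcess
print(function_containing(return_address - 1))  # __ZL14PROCEndProcessPvji
```

Because the call is the final instruction of the caller, the return address belongs to the following symbol, and only the lookup at PC - 1 finds the function that actually made the call.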

Always address module segment to IP lookup list instead of checking the module regions.

Strip pointer authentication bits on PC/LR.
yamato-ci-bot pushed a commit that referenced this pull request Feb 12, 2022
# Local heap optimizations on Arm64

1. When the allocated local heap space does not need to be zeroed (for sizes up to 64 bytes), do not emit a zeroing sequence. Instead, probe the stack and adjust the stack pointer:

```diff
-            stp     xzr, xzr, [sp,#-16]!
-            stp     xzr, xzr, [sp,#-16]!
-            stp     xzr, xzr, [sp,#-16]!
-            stp     xzr, xzr, [sp,#-16]!
+            ldr     wzr, [sp],#-64
```

2. For sizes less than one `PAGE_SIZE` use `ldr wzr, [sp], #-amount` that does probing at `[sp]` and allocates the space at the same time. This saves one instruction for such local heap allocations:

```diff
-            ldr     wzr, [sp]
-            sub     sp, sp, #208
+            ldr     wzr, [sp],#-208
```

Use `ldp tmpReg, xzr, [sp], #-amount` when the offset is not encodable by the post-index variant of `ldr`:
```diff
-            ldr     wzr, [sp]
-            sub     sp, sp, #512
+            ldp     x0, xzr, [sp],#-512
```
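
The choice between the two post-index forms follows from their immediate ranges. The sketch below is Python, not the JIT's C++: `probe_and_allocate` is a hypothetical helper, `x0` stands in for `tmpReg`, and the fallback for larger amounts is an assumption that simply reuses the pre-change two-instruction shape.

```python
def probe_and_allocate(amount: int) -> list:
    """Pick instructions that probe [sp] and then move sp down by `amount` bytes."""
    if amount <= 0 or amount % 16 != 0:
        raise ValueError("amount must be a positive multiple of 16")
    if amount <= 256:
        # LDR post-index takes a signed 9-bit byte offset: -256..255.
        return [f"ldr     wzr, [sp],#-{amount}"]
    if amount <= 512:
        # LDP (64-bit registers) post-index takes a signed 7-bit offset
        # scaled by 8: -512..504 in steps of 8.
        return [f"ldp     x0, xzr, [sp],#-{amount}"]
    # Assumed fallback: separate probe and stack adjustment.
    return ["ldr     wzr, [sp]", f"sub     sp, sp, #{amount}"]


for size in (64, 208, 512):
    print(size, probe_and_allocate(size))
```

With these ranges, 64 and 208 fit the `ldr` form while 512 needs the `ldp` form, matching the three diffs above.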

3. Allow non-loop zeroing (i.e. unrolled sequence) for sizes up to 128 bytes (i.e. up to `LCLHEAP_UNROLL_LIMIT`). This frees up two internal integer registers for such cases:

```diff
-            mov     w11, #128
-                                               ;; bbWeight=0.50 PerfScore 0.25
-G_M44913_IG19:        ; gcrefRegs=00F9 {x0 x3 x4 x5 x6 x7}, byrefRegs=0000 {}, byref, isz
             stp     xzr, xzr, [sp,#-16]!
-            subs    x11, x11, #16
-            bne     G_M44913_IG19
+            stp     xzr, xzr, [sp,#-112]!
+            stp     xzr, xzr, [sp,#16]
+            stp     xzr, xzr, [sp,#32]
+            stp     xzr, xzr, [sp,#48]
+            stp     xzr, xzr, [sp,#64]
+            stp     xzr, xzr, [sp,#80]
+            stp     xzr, xzr, [sp,#96]
```

4. Do zeroing in ascending order of the effective address:

```diff
-            mov     w7, #96
-G_M49279_IG13:
             stp     xzr, xzr, [sp,#-16]!
-            subs    x7, x7, #16
-            bne     G_M49279_IG13
+            stp     xzr, xzr, [sp,#-80]!
+            stp     xzr, xzr, [sp,#16]
+            stp     xzr, xzr, [sp,#32]
+            stp     xzr, xzr, [sp,#48]
+            stp     xzr, xzr, [sp,#64]
```

In the example, the zeroing is done at `[initialSp-16], [initialSp-96], [initialSp-80], [initialSp-64], [initialSp-48], [initialSp-32]` addresses. The idea here is to allow a CPU to detect the sequential `memset` to `0` pattern and switch into write streaming mode.
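
The address list above can be checked by tracking `sp` through the sequence. This is a small Python sketch; `store_offsets` and the tuple encoding of each `stp` are hypothetical conveniences for the illustration.

```python
def store_offsets(instructions):
    """Yield each stp's store address as an offset from the initial sp.

    Each instruction is (immediate, pre_indexed).
    """
    sp = 0  # offset of sp from its initial value
    for immediate, pre_indexed in instructions:
        address = sp + immediate
        if pre_indexed:   # `[sp,#imm]!` writes the computed address back to sp
            sp = address
        yield address


# The unchanged first stp followed by the new unrolled sequence.
new_sequence = [(-16, True), (-80, True), (16, False),
                (32, False), (48, False), (64, False)]
# The old loop: six iterations of `stp xzr, xzr, [sp,#-16]!` for 96 bytes.
old_loop = [(-16, True)] * 6

print(list(store_offsets(new_sequence)))  # [-16, -96, -80, -64, -48, -32]
print(list(store_offsets(old_loop)))      # [-16, -32, -48, -64, -80, -96]
```

After the first store, the new sequence writes at strictly increasing addresses, whereas the old loop walked downward.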
bholmes pushed a commit that referenced this pull request Nov 3, 2022
…6616)

* Support Arm64 "constructed" constants in SuperPMI asm diffs

SuperPMI asm diffs tries to ignore constants that can change between
multiple replays, such as addresses that the replay engine must generate
and not simply hand back from the collected data.

Often, addresses have associated relocations generated during replay.
SuperPMI can use these relocations to adjust the constants to allow
two replays to match. However, there are cases on Arm64 where an address
both doesn't report a relocation and is "constructed" using multiple
`mov`/`movk` instructions.

One case is the `allocPgoInstrumentationBySchema()`
API which returns a pointer to a PGO data buffer. An address within this
buffer is constructed via a sequence such as:
```
mov     x0, #63408
movk    x0, #23602, lsl #16
movk    x0, #606, lsl #32
```

When SuperPMI replays this API, it constructs a new buffer and returns that
pointer, which is used to construct various actual addresses that are
generated as "constructed" constants, shown above.

This change "de-constructs" the constants and looks them up in the replay
address map. If base and diff match the mapped constants, there is no asm diff.
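
A rough sketch of the "de-construct and look up" step, in Python rather than SuperPMI's C++. The tuple encoding of instructions, `decode_constant`, `same_after_mapping`, and the shape of the address maps (replay address to original address) are all assumptions made for illustration.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def decode_constant(sequence):
    """Fold a mov/movk sequence into the 64-bit value it builds.

    `sequence` is a list of (mnemonic, imm16, shift) tuples.
    """
    value = 0
    for mnemonic, imm16, shift in sequence:
        if mnemonic == "mov":
            value = imm16 << shift            # mov sets the register, clearing other bits
        elif mnemonic == "movk":
            value &= ~(0xFFFF << shift)       # movk replaces one 16-bit slice only
            value |= imm16 << shift
        else:
            raise ValueError(mnemonic)
    return value & MASK64


def same_after_mapping(base_seq, diff_seq, base_map, diff_map):
    """Treat two constructed constants as equal if both map to the same original address."""
    base = decode_constant(base_seq)
    diff = decode_constant(diff_seq)
    return base in base_map and diff in diff_map and base_map[base] == diff_map[diff]


example = [("mov", 63408, 0), ("movk", 23602, 16), ("movk", 606, 32)]
print(hex(decode_constant(example)))  # 0x25e5c32f7b0
```

Comparing the mapped values rather than the raw immediates is what lets two replays with different buffer addresses still count as matching.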

* Fix 32-bit build

I don't think we fully support 64-bit replay on 32-bit host, but this
fix at least makes it possible for this case.

* Support more general mov/movk sequence

Allow JIT1 and JIT2 to have a different sequence of
mov/movk[/movk[/movk]] that map to the same address in the
address map. That is, the replay constant might require a different
set of instructions (e.g., if a `movk` is missing because its constant
is zero).
mrvoorhe pushed a commit that referenced this pull request Mar 19, 2025
…3178)

Based on the new `FIELD_LIST` support for returns this PR adds support for the
JIT to combine smaller fields via bitwise operations when returned, instead of
spilling these to stack.

win-x64 examples:
```csharp
static int? Test()
{
    return Environment.TickCount;
}
```

```diff
        call     System.Environment:get_TickCount():int
-       mov      dword ptr [rsp+0x24], eax
-       mov      byte  ptr [rsp+0x20], 1
-       mov      rax, qword ptr [rsp+0x20]
-						;; size=19 bbWeight=1 PerfScore 4.00
+       mov      eax, eax
+       shl      rax, 32
+       or       rax, 1
+						;; size=15 bbWeight=1 PerfScore 2.00
```
(the `mov eax, eax` is unnecessary, but not that simple to get rid of)

```csharp
static (int x, float y) Test(int x, float y)
{
    return (x, y);
}
```

```diff
-       mov      dword ptr [rsp], ecx
-       vmovss   dword ptr [rsp+0x04], xmm1
-       mov      rax, qword ptr [rsp]
+       vmovd    eax, xmm1
+       shl      rax, 32
+       mov      ecx, ecx
+       or       rax, rcx
 						;; size=13 bbWeight=1 PerfScore 3.00
```
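
The two win-x64 sequences above can be modeled in Python to check that the bitwise form yields the same 8 bytes as the old store-and-reload form. This is only an illustration: the helper names are hypothetical, and the padding bytes of `int?` are assumed to be zero in the memory image.

```python
import struct

MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def pack_nullable_int(rax: int) -> int:
    """Model `shl rax, 32 ; or rax, 1`: hasValue in byte 0, value in bytes 4..7."""
    return ((rax << 32) & MASK64) | 1


def pack_int_float(x: int, y: float) -> int:
    """Model `vmovd eax, xmm1 ; shl rax, 32 ; mov ecx, ecx ; or rax, rcx`."""
    y_bits = struct.unpack("<I", struct.pack("<f", y))[0]  # raw bits of the float
    return ((y_bits << 32) & MASK64) | (x & 0xFFFF_FFFF)   # mov ecx, ecx zero-extends x


def memory_image(fmt: str, *fields) -> int:
    """Model the old code: store the fields to the stack, then reload 8 bytes."""
    return int.from_bytes(struct.pack(fmt, *fields), "little")


print(pack_nullable_int(1234) == memory_image("<B3xi", 1, 1234))     # True
print(pack_int_float(-2, 1.5) == memory_image("<if", -2, 1.5))       # True
# Upper bits of rax are discarded by the shift, with or without `mov eax, eax`.
print(pack_nullable_int(0xDEADBEEF_00000007) == pack_nullable_int(7))  # True
```

The last line suggests why the `mov eax, eax` in the first example is redundant: shifting a 64-bit register left by 32 already drops whatever was in the upper half.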

An arm64 example:
```csharp
static Memory<int> ToMemory(int[] arr)
{
    return arr.AsMemory();
}
```

```diff
 G_M45070_IG01:  ;; offset=0x0000
-            stp     fp, lr, [sp, #-0x20]!
+            stp     fp, lr, [sp, #-0x10]!
             mov     fp, sp
-            str     xzr, [fp, #0x10]	// [V03 tmp2]
-						;; size=12 bbWeight=1 PerfScore 2.50
-G_M45070_IG02:  ;; offset=0x000C
+						;; size=8 bbWeight=1 PerfScore 1.50
+G_M45070_IG02:  ;; offset=0x0008
             cbz     x0, G_M45070_IG06
 						;; size=4 bbWeight=1 PerfScore 1.00
-G_M45070_IG03:  ;; offset=0x0010
-            str     x0, [fp, #0x10]	// [V07 tmp6]
-            str     wzr, [fp, #0x18]	// [V08 tmp7]
-            ldr     x0, [fp, #0x10]	// [V07 tmp6]
-            ldr     w0, [x0, #0x08]
-            str     w0, [fp, #0x1C]	// [V09 tmp8]
-						;; size=20 bbWeight=0.80 PerfScore 6.40
-G_M45070_IG04:  ;; offset=0x0024
-            ldp     x0, x1, [fp, #0x10]	// [V03 tmp2], [V03 tmp2+0x08]
-						;; size=4 bbWeight=1 PerfScore 3.00
-G_M45070_IG05:  ;; offset=0x0028
-            ldp     fp, lr, [sp], #0x20
+G_M45070_IG03:  ;; offset=0x000C
+            ldr     w1, [x0, #0x08]
+						;; size=4 bbWeight=0.80 PerfScore 2.40
+G_M45070_IG04:  ;; offset=0x0010
+            mov     w1, w1
+            mov     x2, xzr
+            orr     x1, x2, x1,  LSL #32
+						;; size=12 bbWeight=1 PerfScore 2.00
+G_M45070_IG05:  ;; offset=0x001C
+            ldp     fp, lr, [sp], #0x10
             ret     lr
 						;; size=8 bbWeight=1 PerfScore 2.00
-G_M45070_IG06:  ;; offset=0x0030
-            str     xzr, [fp, #0x10]	// [V07 tmp6]
-            str     xzr, [fp, #0x18]
+G_M45070_IG06:  ;; offset=0x0024
+            mov     x0, xzr
+            mov     w1, wzr
             b       G_M45070_IG04
-						;; size=12 bbWeight=0.20 PerfScore 0.60
+						;; size=12 bbWeight=0.20 PerfScore 0.40
```
(sneak peek -- this codegen requires some supplementary changes, and there are
additional opportunities here)

This is the return counterpart to dotnet#112740. That PR has a bunch of regressions
that make it look like we need to support returns/call arguments first, before
we try to support parameters.

There are a few follow-ups here:
- Support for float->float insertions (when a float value needs to be returned
  as the 1st, 2nd, .... field of a SIMD register)
- Support for coalescing memory loads, particularly because the fields of the
  `FIELD_LIST` come from a promoted struct that ended up DNER. In those cases we
  should be able to recombine the fields back to a single large field, instead
  of combining them with bitwise operations.
- Support for constant folding the bitwise insertions. This requires some more
  constant folding support in lowering.
- The JIT has lots of (now outdated) restrictions based around multi-reg returns
  that get in the way. Lifting these should improve things considerably.