Copy published crossgen2 in artifacts/tests #80154

Merged
52 commits merged into dotnet:main from feature/tests/published-crossgen2
Aug 24, 2024

am11
Member

@am11 am11 commented Jan 3, 2023

Fixes #80110

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Jan 3, 2023
@ghost

ghost commented Jan 3, 2023

Tagging subscribers to this area: @hoyosjs
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #80110

Author: am11
Assignees: -
Labels:

area-Infrastructure-coreclr

Milestone: -

@am11 am11 force-pushed the feature/tests/published-crossgen2 branch 2 times, most recently from 50a8fa9 to 5cb264a Compare January 4, 2023 03:31
@am11 am11 marked this pull request as ready for review January 4, 2023 03:31
@am11 am11 force-pushed the feature/tests/published-crossgen2 branch from 300ce0d to 25514ef Compare January 4, 2023 08:17
@runfoapp runfoapp bot mentioned this pull request Jan 4, 2023
@am11 am11 force-pushed the feature/tests/published-crossgen2 branch 7 times, most recently from 583f275 to 72fe8e5 Compare January 4, 2023 20:57
@am11 am11 force-pushed the feature/tests/published-crossgen2 branch 2 times, most recently from 931c0ee to db8b174 Compare January 6, 2023 14:12
@am11 am11 force-pushed the feature/tests/published-crossgen2 branch 3 times, most recently from a7d30ec to 79c6554 Compare January 7, 2023 22:56
@jeffschwMSFT jeffschwMSFT requested a review from trylek January 30, 2023 21:05
@trylek
Member

trylek commented Feb 5, 2023

Overall looks good to me, thanks Adeel, in general I believe this to be heading in the right direction. We probably need to soften some rough edges though. It seems to me that quite a few R2R tests failed in your PR run, that will require investigation and fixing (theoretically these could be caused by the weird addition of ".." in populating CORE_ROOT I commented on in the review as I find that surprising). Please also see my comment regarding the filtering on checked build configuration. Once we're on the same page w.r.t. these aspects, I'll be more than happy to approve your change and merge it in.

@am11 am11 closed this Feb 5, 2023
@am11 am11 reopened this Feb 5, 2023
@trylek
Member

trylek commented Feb 7, 2023

In today's .NET Core runtime sync, several people emphasized that we should coordinate this work with the diagnostics team to make sure we don't paint ourselves into a corner, given that NativeAOT bugs are currently harder to analyze and investigate in general. /cc-ing @tommcdon for visibility.

@jkoritzinsky jkoritzinsky merged commit 62835af into dotnet:main Aug 24, 2024
162 of 170 checks passed
@am11 am11 mentioned this pull request Aug 24, 2024
4 tasks
@am11 am11 deleted the feature/tests/published-crossgen2 branch August 24, 2024 18:47
@am11
Member Author

am11 commented Aug 24, 2024

Thank you! BinaryFormatter failure was #104216.

Follow up list is available here: #106929.

@jakobbotsch
Member

jakobbotsch commented Aug 26, 2024

With this PR the repo no longer builds for me. I get this error:

C:\dev\dotnet\runtime2\.dotnet\sdk\9.0.100-preview.7.24407.12\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Sdk.FrameworkReferenceResolution.targets(495,5): error NETSDK1112: The runtime pack for Microsoft.NETCore.App.Runtime.win-x64 was not downloaded. Try running a NuGet restore with the RuntimeIdentifier 'win-x64'. [C:\dev\dotnet\runtime2\src\coreclr\tools\aot\crossgen2\crossgen2_publish.csproj]

Build command is .\build.cmd clr+libs -rc checked -c release && .\src\tests\build.cmd checked generatelayoutonly.

@jakobbotsch
Member

jakobbotsch commented Aug 26, 2024

Also, superpmi-collect fails during "Create Core_Root" step on linux-arm (and probably other platforms) with:

2024-08-25T18:07:53.9782152Z   Unhandled exception: System.DllNotFoundException: Unable to load shared library 'clrjit_universal_arm_x64' or one of its dependencies. In order to help diagnose loading problems, consider using a tool like strace. If you're using glibc, consider setting the LD_DEBUG environment variable: 
2024-08-25T18:07:53.9787522Z   /mnt/vss/_work/1/s/artifacts/bin/coreclr/linux.arm.Checked/x64/ilc/clrjit_universal_arm_x64.so: cannot open shared object file: No such file or directory
2024-08-25T18:07:53.9788209Z   /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.38' not found (required by /mnt/vss/_work/1/s/artifacts/bin/coreclr/linux.arm.Checked/x64/ilc/libclrjit_universal_arm_x64.so)
2024-08-25T18:07:53.9788573Z   libclrjit_universal_arm_x64.so: cannot open shared object file: No such file or directory
2024-08-25T18:07:53.9788916Z   /mnt/vss/_work/1/s/artifacts/bin/coreclr/linux.arm.Checked/x64/ilc/clrjit_universal_arm_x64: cannot open shared object file: No such file or directory
2024-08-25T18:07:53.9790253Z   /mnt/vss/_work/1/s/artifacts/bin/coreclr/linux.arm.Checked/x64/ilc/libclrjit_universal_arm_x64: cannot open shared object file: No such file or directory
2024-08-25T18:07:53.9790466Z   
2024-08-25T18:07:54.0025059Z      at System.Runtime.InteropServices.NativeLibrary.LoadLibraryByName(String libraryName, Assembly assembly, Nullable`1 searchPath, Boolean throwOnError)
2024-08-25T18:07:54.0025730Z      at Internal.JitInterface.JitConfigProvider.<>c__DisplayClass5_0.<Initialize>b__0(String libName, Assembly assembly, Nullable`1 searchPath) in /_/src/coreclr/tools/Common/JitInterface/JitConfigProvider.cs:line 48
2024-08-25T18:07:54.0026210Z      at System.Runtime.InteropServices.NativeLibrary.LoadLibraryCallbackStub(String libraryName, Assembly assembly, Boolean hasDllImportSearchPathFlags, UInt32 dllImportSearchPathFlags)
2024-08-25T18:07:54.0026541Z   
2024-08-25T18:07:54.0026748Z      at Internal.JitInterface.CorInfoImpl.jitStartup(IntPtr host)
2024-08-25T18:07:54.0027006Z      at Internal.JitInterface.CorInfoImpl.jitStartup(IntPtr host)
2024-08-25T18:07:54.0027324Z      at Internal.JitInterface.CorInfoImpl.Startup(CORINFO_OS os) in /_/src/coreclr/tools/Common/JitInterface/CorInfoImpl.cs:line 177
2024-08-25T18:07:54.0027706Z      at ILCompiler.RyuJitCompilationBuilder.ToCompilation() in /_/src/coreclr/tools/aot/ILCompiler.RyuJit/Compiler/RyuJitCompilationBuilder.cs:line 127
2024-08-25T18:07:54.0028041Z      at ILCompiler.Program.Run() in /_/src/coreclr/tools/aot/ILCompiler/Program.cs:line 580
2024-08-25T18:07:54.0028416Z      at ILCompiler.ILCompilerRootCommand.<>c__DisplayClass240_0.<.ctor>b__0(ParseResult result) in /_/src/coreclr/tools/aot/ILCompiler/ILCompilerRootCommand.cs:line 293
2024-08-25T18:07:54.0028690Z      at System.CommandLine.Invocation.InvocationPipeline.Invoke(ParseResult parseResult)
2024-08-25T18:07:54.0316349Z /mnt/vss/_work/1/s/artifacts/bin/coreclr/linux.arm.Checked/build/Microsoft.NETCore.Native.targets(317,5): error MSB3073: The command ""/mnt/vss/_work/1/s/artifacts/bin/coreclr/linux.arm.Checked/x64/ilc/ilc" @"/mnt/vss/_work/1/s/artifacts/obj/coreclr/crossgen2_publish/linux.arm.Checked/native/crossgen2.ilc.rsp"" exited with code 1. [/mnt/vss/_work/1/s/src/coreclr/tools/aot/crossgen2/crossgen2_publish.csproj]
2024-08-25T18:07:54.5680688Z   Determining projects to restore...
2024-08-25T18:07:55.8919801Z   All projects are up-to-date for restore.
2024-08-25T18:07:55.9387359Z 
2024-08-25T18:07:55.9388368Z Build FAILED.

Edit: Looks like linux-arm and linux-arm64 are broken while other targets are ok.
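
Most of the probe failures in the log above are plain "No such file or directory" misses, but one line reports a `GLIBC_2.38` version requirement coming from `libclrjit_universal_arm_x64.so`, which suggests that this candidate was found and then rejected by the loader. A minimal Python sketch for separating the two kinds of failure in a log like this one; the function name and both regular expressions are illustrative assumptions, not part of any dotnet tooling:

```python
import re

# glibc loader message, for example:
#   /lib/.../libm.so.6: version `GLIBC_2.38' not found (required by /path/libfoo.so)
GLIBC_RE = re.compile(
    r"version `(GLIBC_[0-9.]+)' not found \(required by ([^)]+)\)"
)

# Plain probe miss, for example:
#   /path/libfoo.so: cannot open shared object file: No such file or directory
MISSING_RE = re.compile(
    r"(\S+): cannot open shared object file: No such file or directory"
)


def triage_loader_log(log_text):
    """Split loader failures into glibc version mismatches and plain misses.

    Returns a dict with:
      "glibc_mismatch": list of (required_version, requiring_library) tuples
      "missing":        list of probed names that were not found on disk
    """
    mismatches = [(m.group(1), m.group(2)) for m in GLIBC_RE.finditer(log_text)]
    missing = [m.group(1) for m in MISSING_RE.finditer(log_text)]
    return {"glibc_mismatch": mismatches, "missing": missing}
```

A non-empty `glibc_mismatch` list would hint that the x64 ilc bits were built against a newer glibc than the one on the machine running them, rather than being absent altogether.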

Edit 2: I'm guessing the problem is somewhere in the following that does not end up producing the right crossgen2 bits:

- template: /eng/pipelines/common/platform-matrix.yml
  parameters:
    jobTemplate: /eng/pipelines/common/global-build-job.yml
    buildConfig: checked
    platforms:
    - linux_arm
    - linux_arm64
    jobParameters:
      testGroup: outerloop
      buildArgs: -s clr+libs+libs.tests -rc $(_BuildConfig) -c Release /p:ArchiveTests=true
      timeoutInMinutes: 120
      postBuildSteps:
        # Build CLR assets for x64 as well as the target as we need an x64 mcs
        - template: /eng/pipelines/common/templates/global-build-step.yml
          parameters:
            buildArgs: -s clr.spmi -c $(_BuildConfig)
            archParameter: -arch x64
            container: linux_x64
        - template: /eng/pipelines/coreclr/templates/build-native-test-assets-step.yml
        - template: /eng/pipelines/common/upload-artifact-step.yml
          parameters:
            rootFolder: $(Build.SourcesDirectory)/artifacts/bin
            includeRootFolder: false
            archiveType: $(archiveType)
            archiveExtension: $(archiveExtension)
            tarCompression: $(tarCompression)
            artifactName: BuildArtifacts_$(osGroup)$(osSubgroup)_$(archType)_$(_BuildConfig)
        - template: /eng/pipelines/common/upload-artifact-step.yml
          parameters:
            rootFolder: $(Build.SourcesDirectory)/artifacts/helix
            includeRootFolder: false
            archiveType: $(archiveType)
            archiveExtension: $(archiveExtension)
            tarCompression: $(tarCompression)
            artifactName: LibrariesTestArtifacts_$(osGroup)$(osSubgroup)_$(archType)_$(_BuildConfig)
      extraVariablesTemplates:
        - template: /eng/pipelines/common/templates/runtimes/native-test-assets-variables.yml
          parameters:
            testGroup: outerloop
      disableComponentGovernance: true # No shipping artifacts produced by this pipeline

@jakobbotsch
Member

https://dev.azure.com/dnceng-public/public/_build/results?buildId=787949&view=results also appears to have some failures that look related in the "Test crossgen2-comparison X to Y" jobs.

@am11
Member Author

am11 commented Aug 26, 2024

> https://dev.azure.com/dnceng-public/public/_build/results?buildId=787949&view=results also appears to have some failures that look related in the "Test crossgen2-comparison X to Y" jobs.

Can you pinpoint the error? Those failures look to be same as #106948.

@am11
Member Author

am11 commented Aug 26, 2024

> .\build.cmd clr+libs -rc checked -c release && .\src\tests\build.cmd checked generatelayoutonly

The supported command is (and always has been):
.\build.cmd clr+libs -rc checked -c release && .\src\tests\build.cmd checked generatelayoutonly -p:LibrariesConfiguration=Release

@jakobbotsch
Member

> Can you pinpoint the error? Those failures look to be same as #106948.

For example: https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-heads-main-600889e483e24676bf/WorkItem/1/console.51e0e9bf.log?helixlogtype=result

> .\build.cmd clr+libs -rc checked -c release && .\src\tests\build.cmd checked generatelayoutonly
>
> The supported command is (and always has been): .\build.cmd clr+libs -rc checked -c release && .\src\tests\build.cmd checked generatelayoutonly -p:LibrariesConfiguration=Release

I have built this way for 3 years without issues. I do not see a mention that LibrariesConfiguration needs to be specified always in the workflow docs (actually I see the opposite). https://github.com/dotnet/runtime/blob/main/docs/workflow/testing/coreclr/testing.md#building-the-core_root

@am11
Member Author

am11 commented Aug 26, 2024

From main/docs/workflow/testing/coreclr/testing.md#building-the-core_root

> Note that for the libraries configuration, we are passing the argument directly to MSBuild instead of the build script, hence the /p:LibrariesConfiguration flag. Also, make sure you use the correct syntax depending on our platform. The cmd script takes the arguments by placing, while the sh script requires them to be with a hyphen.

@jakobbotsch
Member

> From main/docs/workflow/testing/coreclr/testing.md#building-the-core_root
>
> Note that for the libraries configuration, we are passing the argument directly to MSBuild instead of the build script, hence the /p:LibrariesConfiguration flag. Also, make sure you use the correct syntax depending on our platform. The cmd script takes the arguments by placing, while the sh script requires them to be with a hyphen.

Yes, for debug libraries I understand that it needs to be specified. I never build debug libraries, so I never have needed to specify it -- section below what you quote mentions that it is only necessary if your libraries configuration is not release.

@am11
Member Author

am11 commented Aug 26, 2024

The fix is simple (set LibrariesConfiguration = Release by default). We did run a couple of pipelines, including superpmi by making a dummy change. Apparently it wasn't enough.

/azp run runtime-coreclr crossgen2

We were using that as an ultimate test and didn't run crossgen2 outerloop.

@jakobbotsch
Member

> The fix is simple (set LibrariesConfiguration = Release by default).

👍

> We did run a couple of pipelines, including superpmi by making a dummy change. Apparently it wasn't enough.

In this case superpmi-collect is the pipeline that substantially uses crossgen2, while superpmi-diffs/superpmi-replay do not use it at all. Those latter pipelines need only JIT + SPMI builds and then download the collections created by superpmi-collect.
Sadly superpmi-collect is not public, so we (the JIT team) will have to help a bit in getting this working.

One thing that confuses me a bit is that it seems we are running ilc even when generatelayoutonly is specified. I would not expect generatelayoutonly to start running ILC and prejitting things.

> /azp run runtime-coreclr crossgen2
>
> We were using that as an ultimate test and didn't run crossgen2 outerloop.

To be fair I am not sure that the crossgen2 outerloop pipeline is checked much. I just noticed it while trying to find a publicly accessible pipeline run that hit the same issue as superpmi-collect (since it is only internally accessible).

@jakobbotsch
Member

> Sadly superpmi-collect is not public

I think we can add a public test-only version of this pipeline that does everything except upload the collections. That should make the testing much more straightforward for contributors.

@am11
Member Author

am11 commented Aug 26, 2024

If it is blocking, let's revert it. I think two of the fixes are straightforward (add a line in src/tests/Common/Directory.Build.props and update crossgen2 pipeline yamls to drop .dll).

Yup, the superpmi-collect issue needs some investigation. It may be as simple as passing --cross to:

# Build CLR assets for x64 as well as the target as we need an x64 mcs
- template: /eng/pipelines/common/templates/global-build-step.yml
  parameters:
    buildArgs: -s clr.spmi -c $(_BuildConfig)
    archParameter: -arch x64
    container: linux_x64

or something more involved. I'll take a closer look when I am off my day job. 😅

@jakobbotsch
Member

> If it is blocking, let's revert it. I think two of the fixes are straightforward (add a line in src/tests/Common/Directory.Build.props and update crossgen2 pipeline yamls to drop .dll).
>
> Yup, the superpmi-collect issue needs some investigation. It may be as simple as passing --cross to:
>
> # Build CLR assets for x64 as well as the target as we need an x64 mcs
> - template: /eng/pipelines/common/templates/global-build-step.yml
>   parameters:
>     buildArgs: -s clr.spmi -c $(_BuildConfig)
>     archParameter: -arch x64
>     container: linux_x64
>
> or something more involved. I'll take a closer look when I am off my day job. 😅

Thanks! I think we should revert it for now as otherwise we'll quickly end up with no SPMI collections for linux-arm/linux-arm64 while we investigate.

jakobbotsch added a commit that referenced this pull request Aug 26, 2024
am11 added a commit to am11/runtime that referenced this pull request Aug 26, 2024
am11 added a commit to am11/runtime that referenced this pull request Aug 26, 2024
jakobbotsch added a commit to jakobbotsch/runtime that referenced this pull request Aug 26, 2024
am11 added a commit to am11/runtime that referenced this pull request Aug 27, 2024
jtschuster pushed a commit to jtschuster/runtime that referenced this pull request Sep 17, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Sep 28, 2024
Development

Successfully merging this pull request may close these issues.

Wrong crossgen2 binary in artifacts/tests
9 participants