-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Fix Alpine Target RID #40843
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Alpine Target RID #40843
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did the RID change though?
I have the same question as @mthalman. This seems very concerning from a maintenance standpoint. Also, why are we using Alpine RIDs at all? |
Hm I'm not sure. Looks like when the build starts, the build environment is |
I'm not entirely sure how these RIDs are being used. We're no longer using RIDs like these "in the product". We've switched entirely to portable RIDs like Context: dotnet/runtime#83246 If we need a solution that describes a specific OS, it would be good to call them something other than RIDs. |
The issue is that the _DistroRid value was changed in dotnet/runtime#100580. This was then ported to Arcade in dotnet/arcade#14752. Not sure how to proceed given this finding. A simple fix would be to add the calculation in That said, I'm not entirely sure why this change in calculation was made to begin with. @am11 - Can you give guidance here please?
The issue of using a portable rid like |
OK. I'm getting a bit more clarity with those links. Let's we set the RID as |
Explicitly specifying the TargetRid for non-portable builds is the right way to go about it.
@ellahathaway, the caller of VMR's build.sh can pass |
There would be no fallback from 3.19.1 since there's no RID graph, right? We wouldn't want 3.19.1 for our build anyway because we'd just have to update the reference RID again when 3.19.2 is released. I think we'd want to explicitly set |
And that stable value can be either |
The fallback is the portable platform, which in this case is linux-musl. The fallback to linux-musl is linux and so on. |
We still have a RID graph. It's just WAY smaller. The link I provided documents it. |
To solve this issue, the question I'm currently trying to answer is whether our assets are portable or not. To my knowledge, if assets do not have dependencies on native libraries or do not have runtime dependencies that vary by operating system or other runtime-environment characteristics, then they are portable. Given the above definition, I think that our assets are portable. If this thought is true and our assets are in fact portable, then we can use a portable RID without issue. Otherwise, we need to know if using the portable RID means that our assets are also portable. It is the latter question that I do not have the answers to yet. |
The typical test is whether the assets are bound to a single version of OpenSSL. If that's the case, then they are distro-specific. If they can tolerate running on multiple versions, they are portable. |
Normally, in the binlogs, if PortableBuild is evaluated to true, then that's the easiest way to tell the difference. Other toggles are derived from there. Another way is to run dotnet with corehost tracing variable I said normally because technically someone can mix and match these concepts by passing some internal properties, which make the build look partially portable, but it's not the normal course of action. 😅 |
Pragmatically, we want two patterns:
There are at least two problems with super specific RIDs:
I know that Red Hat has had this same problem. I'm curious how they solved it. |
If Microsoft source builds should be using non-portable RIDs, then we should be using linux-musl-x64 in this place. For distro maintainers, they will call If Microsoft source builds indeed need non-portable RIDs for some reason, we can pass We removed this handling from arcade to simplify the process which prevents us from maintaining many distro infos and their preferred style of 0,1,2,3 parts of versions in RID. |
We haven't solved it yet, but I plan to use an explicit |
@omajid, just an FYI, we can cleanup this special handling from VMR as well: https://github.com/dotnet/dotnet/blob/41de9105d46ea7d01fd14a10ab5a12d1474bbc75/Directory.Build.props#L54 and make --outputrid a required arg (maps to -p:TargetRID) to keep things really explicit. I've found cycles to do the cleanup and may not get around it anytime soon, so when you start looking into it, go ahead and do the honors. ;) |
When it makes sense, please use the GNU configure terms, i.e. the outputrid CLI arg should probably just be called |
Can't agree more @ViktorHofer! The term __DistroRid mapping to TargetRid, which in turn mapping to OuputRID is not needed after the recent simplifications made in this area. So |
I wanted to give a heads up on some testing I did of the source-build build using portable RIDs: I ran a build where I set the TargetRid property to the portable RIDs for centos and alpine (linux-x64 and linux-musl-x64). The centos build was successful, but the Alpine build encountered a failure during the aspnetcore build process. The build failed because aspnetcore sets the TargetOsName as 'linux', and this is then used to determine the version of crossgen to use. Consequently, aspnetcore expects to use the linux version of crossgen, when in reality, the linux-musl version is being produced. Furthermore, after building SB on centos9 with TargetRid=linux-x64, I tried to run the produced sdk on a prereqs centos8 container. I received the following output when doing this: [root@dc446d83b92f dotnet]# .dotnet/dotnet
.dotnet/dotnet: /lib64/libc.so.6: version `GLIBC_2.33' not found (required by .dotnet/dotnet)
.dotnet/dotnet: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by .dotnet/dotnet) Seems like a container issue rather than an issue with the sdk having a portable RID because I was able to successfully run the produced sdk on a prereqs ubuntu-22.04 container. |
I think this illustrates a problem with naming in the context of of CI. The To get something that works like a general Should we force portable-build mode when a portable RID is specified? Should we check and fail if we are building with a portable RID but the glibc we are building against is newer than our actual baseline? |
VMR defaults to false, so we just need to pass the prop: If reverse mapping is cheap, we can do it, but PortableBuild is the property we have for a while to override the flavor when needed. |
This is a fantastic point and something that I think is missing in our current infra. IMO, if we apply a RID that has certain properties, the build should follow those properties... it's clear now that this is not the case. We'd likely have to set other properties in the build configuration (i.e. -p:PortableBuild=true). I think that requiring developers to set these properties really hurts the out of the box experience. Rather, as you suggest Omair, maybe there should be infra in place to force portable builds or fail building if certain conditions are not met when we set the TargetRID to be something portable.
I understand this. I think that the question I'm trying to answer is whether or not our artifacts are actually portable (is source-build actually portable if we set the portable build property to true). I don't have the answer to this question. Overall, I worry also about the upcoming preview5 deadline. If we release preview5 with our alpine artifact as 3.19.1, it's going to cause breaks in our source-build pipeline (something similar happened when we went from alpine 3.18 to alpine 3.19). Given the current focus on other work, I'd like to avoid a setback like this and come to a solution quickly. My current proposed solution is to specify the target RID to be the correct 3.19 version for alpine. This is a quick solution (albeit a bit hacky) and gives us time to return to the greater issue at hand when there's a bit more time. |
Yes. PortableBuild=true makes the build portable and PortableBuild=false makes it non-portable. |
Test build: https://dev.azure.com/dnceng/internal/_build/results?buildId=2457692&view=results (internal Microsoft link) |
The test build failed. The issue for Alpine is the same as what I had described earlier (crossgen gets produced as linux-musl and aspnetcore expects crossgen to be linux). As for the centos legs, those fail when running the scenario tests. The error is:
cc @mthalman |
@ellahathaway, looking at https://github.com/dotnet/dotnet/blob/1f01ca0ae82f4cd3a74756ef8b94c99f85290bf7/repo-projects/aspnetcore.proj#L21, for aspnetcore it is setting |
@am11 - Thanks for your help! I logged dotnet/source-build#4418 and dotnet/source-build#4419 so that these conversations can be continued there rather than in this PR. I've updated this PR with a quick fix to unblock SB. |
/backport to release/9.0.1xx-preview5 |
Started backporting to release/9.0.1xx-preview5: https://github.com/dotnet/sdk/actions/runs/9367651838 |
@mthalman backporting to release/9.0.1xx-preview5 failed, the patch most likely resulted in conflicts: $ git am --3way --ignore-whitespace --keep-non-patch changes.patch
Applying: Fix Alpine Target RID
Using index info to reconstruct a base tree...
M src/SourceBuild/content/eng/pipelines/source-build-sdk-diff-tests.yml
Falling back to patching base and 3-way merge...
Auto-merging src/SourceBuild/content/eng/pipelines/source-build-sdk-diff-tests.yml
CONFLICT (content): Merge conflict in src/SourceBuild/content/eng/pipelines/source-build-sdk-diff-tests.yml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0001 Fix Alpine Target RID
Error: The process '/usr/bin/git' failed with exit code 128 Please backport manually! |
@mthalman an error occurred while backporting to release/9.0.1xx-preview5, please check the run log for details! Error: git am failed, most likely due to a merge conflict. |
Closes dotnet/source-build#4393
Looking at the artifacts produced in this build (internal Microsoft SDK), the alpine target RID was
3.19-x64
but is now3.19.1-x64
. This needs to be updated in the sdk diff test pipeline so that the correct build artifacts can be found.