Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove osversion usage in computestorage APIs #1463

Merged
merged 2 commits into from
Aug 10, 2022

Conversation

dcantah
Copy link
Contributor

@dcantah dcantah commented Jul 20, 2022

This change reverts commit aaf5db9.

We'd added a change to FormatWritableLayerVhd to help the caller work around
a breaking change in the OS, but this would actually cause a breaking change
in our wrapper of it if the caller was already working around the issue
themselves. To avoid this scenario, revert the commit that added the
"friendly" behavior.

This additionally adds a comment on our wrapper of HcsFormatWritableLayerVhd to describe that
it expects a different handle on anything above ws2019 and adds a comment above SetupBaseOSVolume
stating what build it's supported on.

@dcantah dcantah requested a review from a team as a code owner July 20, 2022 09:55
@dcantah
Copy link
Contributor Author

dcantah commented Jul 20, 2022

@TBBle I've went back and forth on this. As a user of the package, what do you think? Funnily enough this is not the only exposed API that calls osversion, ExpandSandboxSize bounces to wclayer.ExpandScratchSize which calls it to handle some edgecases on 19h1, albeit the work needed is much more than the breaking change with the disk handle described in this https://github.com/microsoft/hcsshim/blob/master/internal/wclayer/expandscratchsize.go#L35-L142

I don't want to end up in a spot where someone adding in a workaround in their code can break depending on whether they manifest in the future or not. e.g. someone adds in the file handle workaround for FormatWritableLayerVhd by checking the registry or RtlGetVersion etc. and then decide to manifest their binary later on causes us to enter that condition and try and AttachVirtualDisk using their file handle they passed us and their code breaks. While it is a tad annoying, maybe the best place to keep this API easily callable is to let the application worry about the extra logic on RS5.

I do still think we should manifest containerd however as it's likely far overdue, even though this is what spurred it.

@slonopotamus
Copy link
Contributor

slonopotamus commented Jul 20, 2022

Did you consider a different approach: just stop using GetVersion and instead obtain OS version info through more reliable sources?

GetVersion docs say that you should not use it at all:

GetVersion may be altered or unavailable for releases after Windows 8.1. Instead, use the Version Helper functions.

@dcantah
Copy link
Contributor Author

dcantah commented Jul 20, 2022

Did you consider a different approach: just stop using GetVersion and instead obtain OS version info through more reliable sources?

GetVersion docs say that you should not use it at all:

GetVersion may be altered or unavailable for releases after Windows 8.1. Instead, use the Version Helper functions.

They still require the binary to be manifested to return correct info. If it were up to me I'd just swap osversion to checking the registry or RtlGetVersion, but they were both recommended against internally as manifesting is still the route. I didn't hear any shouts for a need to get off GetVersion however. Curious if that notion has shifted since the last I checked though. VerifyVersionInfo which is what those helpers call underneath is also "deprecated" since Windows 10 which is funny

@dcantah dcantah marked this pull request as draft July 23, 2022 22:45
@jterry75
Copy link
Contributor

Yuck lol

@dcantah
Copy link
Contributor Author

dcantah commented Jul 25, 2022

For real

@dcantah
Copy link
Contributor Author

dcantah commented Jul 25, 2022

@jterry75 We may be able to use the supported schema version here. It seems the minor gets incremented per official windows release, but all we really care about is 2.1 (RS5) or > 2.1 here. There may be an odd pre-release build in between rs5 and 19h1 where they made the breaking change but the schema version was still 2.1 (not sure when they increment it in development, asked), but I don't think that'd be a huge population to worry about

@jterry75
Copy link
Contributor

Interesting idea. Insider builds are not supported anyways so I think thats fine. I suppose at the end of the day is no different than using build numbers. We say build NNN means features XYZ. We could just as easily say Schema Y.N means features XYZ

@dcantah
Copy link
Contributor Author

dcantah commented Jul 26, 2022

Right, they have a doc detailing the increments in schema version for each version so it'd even be possible to adopt it for other checks if need be. They just rev the minor for every windows release. Although your parsing of cmd /c ver is also an option. I guess we could even skip manifesting containerd in that case, although we miss out on a cool icon for containerd 😆

@jterry75
Copy link
Contributor

jterry75 commented Jul 26, 2022

I guess we could even skip manifesting containerd in that case, although we miss out on a cool icon for containerd 😆

So funny

If we are ok calling cmd /c ver I think thats the easiest option, or we could have osversion do the trick for us, so basically ver = CallHCSOnceAndGetSchemaVersion(); return osversion.V... when 2.1 etc. So that none of the code really considers it any different. Using the schema is sorta hidden. Either works for me.

We can figure out a new way to get a cool icon..

@dcantah
Copy link
Contributor Author

dcantah commented Jul 26, 2022

The only thing cmd /c ver brings to mind is the classic text parsing dilemma, if the format changes our parsing will implode. Don't picture that happening anytime soon, but something to note. Also.. I kind of like the idea of the cmd /c ver as a fallback instead of replacement. Given the query is only done once, if we GetVersion() and we get back the bogus win8 build number, we can fallback to cmd.

@jterry75
Copy link
Contributor

Fallback works for me. Then if someone wants to manifest they get the perf for free. I like it

@dcantah
Copy link
Contributor Author

dcantah commented Jul 27, 2022

@jterry75 Before making an osversion change, we're gonna try to fix the problem in here first. Given 0.9.x didn't have any friendly logic and just passed whatever handle you give it, if anyone WAS doing the workaround themselves, if they updated to v0.10.x and we try and be friendly inside that would also break them.

I think we're going to try and stick with this change, e.g. just revert any osversion usage, and then make another wrapper for this function for v0.10.0 with the "friendly" behavior and ideally deprecate this one. We're trying to get the HCS teams blessing if just checking the schemas sufficient

@dcantah dcantah marked this pull request as ready for review July 29, 2022 22:30
@dcantah
Copy link
Contributor Author

dcantah commented Jul 29, 2022

@kevpar Undrafted this. Lets get back to the v0.9.x behavior first and then we can deprecate/recommend a new call that handles things for the caller

@dcantah
Copy link
Contributor Author

dcantah commented Aug 1, 2022

@kevpar Reverted the commit instead of doing it manually and added an additional commit to document things a bit, PTAL

@helsaawy helsaawy self-assigned this Aug 3, 2022
… RS5 (microsoft#1204)"

This reverts commit aaf5db9.

We'd added a change to FormatWritableLayerVhd to help the caller work around
a breaking change in the OS, but this would actually cause a breaking change
in our wrapper of it if the caller was already working around the issue
themselves. To avoid this scenario, revert the commit that added the
"friendly" behavior.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
@TBBle
Copy link
Contributor

TBBle commented Aug 4, 2022

I haven't worked through the codebase, but since the early comment mentioned ExpandScratchSize, I saw the check there is only covering two releases (1903 and 1909), which are now out-of-support (even for Windows 10 Enterprise, as of May), and there was never a Windows Server LTSC build in that range.

So maybe that check/workaround can be dropped too. I think it'd be an interesting result if we can get rid of all the osversion.Build calls outside the testsuite, given that Windows Server 2016 has left mainstream support, so everything likely to use new hcsshim builds from whatever the next release branch is called is RS5 or later.

@dcantah
Copy link
Contributor Author

dcantah commented Aug 5, 2022

@TBBle I'm more curious if the OS fix that's mentioned that is being worked around was ever fixed 🤣. We can probably remove that as well though yes (not in this PR)

Add a comment on our wrapper of HcsFormatWritableLayerVhd to describe that
it expects a different handle on anything above ws2019. Additionally add
a comment above SetupBaseOSVolume stating what build it's supported on
and that the application must be manifested

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
@TBBle
Copy link
Contributor

TBBle commented Aug 6, 2022

As an experiment I threw the current PR into the containerd test-case from #1375, and it looks good.

There's some unrelated failures, but unit and integration tests passed on both WS 2019 and WS 2022. (Both failed on TestContainerdRestart in the CRI integration test suite, which is mildly concerning, but I can't see how that would be related to what this PR is doing. Last time I looked, containerd's CRI integration test suite was manifested, so it wasn't affected by #1375 anyway.)

@dcantah
Copy link
Contributor Author

dcantah commented Aug 6, 2022

That test has been on and off failing for a bit, we'll fix one cause and out pops another. I think I saw it fail on main there a couple days ago so I don't think it's from this either, but we should take another look as it seems to be causing a mess again. I'm glad the pkg/os test that uses formatwritablelayer is passing again though it sounds like

@dcantah dcantah merged commit 2a9d2d9 into microsoft:main Aug 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants