-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Motivation docs for graph #5741
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really happy to see this doc getting updated! Left some comments on a few parts.
documentation/specs/static-graph.md
Outdated
|
||
The second build will: | ||
|
||
1. Build the library project, skipping all well-authored targets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think it would help to have a definition of "well-authored targets" in the doc. Not sure how generally well understood that concept is and it's pretty critical to this feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is critical to old features too--I just mean "defines inputs/outputs". Fixing.
documentation/specs/static-graph.md
Outdated
|
||
Microsoft has an internal build system, [CloudBuild](https://www.microsoft.com/research/publication/cloudbuild-microsofts-distributed-and-caching-build-service/), that supports this and has proven that it is effective, but is heuristic-based and requires maintenance. | ||
|
||
MSBuild static graph features make it easier to implement a system like CloudBuild by building required operations into MSBuild itself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what "required operations" means here.
Static graph functionality can be used in three ways: | ||
|
||
- On the command line with `-graph` (and equivalent API). | ||
- This gets the scheduling improvements for well-specified projects, but allows underspecified projects to complete without error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the -graph
command error if it cannot produce an acyclic graph?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh, technically yes?
$ dotnet msbuild -graph
Microsoft (R) Build Engine version 16.8.0-preview-20451-02+51a1071f8 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.
Stack overflow.
😰 #3757
- On the command line with `-graph` (and equivalent API). | ||
- This gets the scheduling improvements for well-specified projects, but allows underspecified projects to complete without error. | ||
- On the command line with `-graph -isolate` (and equivalent API). | ||
- This gets the scheduling improvements and also enforces that the graph is correct and complete. In this mode, MSBuild will produce an error if there is an `MSBuild` task invocation that was not known to the graph ahead of time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May want to explain a bit what "graph is correct" means.
|
||
This part of the error is the problem here: | ||
|
||
> the reference was called with a target which is not specified in the ProjectReferenceTargets item in project "S:\Referencing\Referencing.csproj" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think you need to elaborate a bit more on why this is an error in the isolated mode. Particularly because I'm honestly unsure of what the exact problem is here 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TLDR: /isolate ensures the graph build does at least the work that vanilla msbuild does. But it's pretty complicated to use and probably only suitable for sdk writers.
- it brutally checks the correctness of "what targets should a graph node be called with?". In top-down just-in-time vanilla msbuild a project gets built at the time of discovery. It gets called with whatever arbitrary target values end up in
<MSBuild Targets=$(some_targets)
. During a graph build the runtime graph gets statically inferred and built bottom up, but without doing symbolic execution on the xml we have no idea what targets to call a node with. We solved this by adding a way to statically declare what targets a project calls on its references via the ProjectReferenceTargets protocol, and then we run a data flow over the graph to propagate them. But how do we know the declarations are correct? At minimum the graph build should build at least what vanilla msbuild builds. So one can use /isolate to ensure that the called targets are a superset of the targets run by vanilla msbuild. This is most useful for SDK writers, to ensure that their p2p target calling patterns are inferred by static graph. Not so much useful for end users unless they customize their build. - correctness wise, even if /isolate is violated, the build would most likely still end up correct. If A calls B with Foo, and Foo is undeclared, then Foo won't get run when the graph build builds B, but will get run regardless when the graph build A, and then A calls into B with Foo. However, if you have a distributed / cached build, it would be nice to capture and cache Foo's potentially expensive CPU work and IO side effects as part of B's scheduled run and cache entry, not as part of A's run. The higher order build engine might even complain if Foo touches the outputs of B, or the outputs of B's references. The fact that Foo's side effects get captured in A's cache entry rather than B's cache entry may or may not be a problem, depending on what one uses the caches for. Just binplacing cached results is fine, doing analyses to infer file dataflow might not be fine.
- MSBuild needs a way to build a project "in isolation", without following references. This is useful when another tool is orchestrating the build. Both VS and Quickbuild build their own msbuild graph and then tell msbuild to build each project in isolation, without following references. To do this both of them rely on the sdks respecting the
BuildProjectReferences=false
convention. /isolate could be used to check thatBuildProjectReferences
is actually respected. - it enables a potentially interesting way of doing a graph build. If we can correctly infer all the targets a project gets called with, we could visit each node exactly once and call it with all the inferred targets. Then we could serialize those target results and feed them to all the referencing projects. This would enable each project only doing its own work, without reaching out and doing any work in other projects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jaredpar!
documentation/specs/static-graph.md
Outdated
|
||
The second build will: | ||
|
||
1. Build the library project, skipping all well-authored targets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is critical to old features too--I just mean "defines inputs/outputs". Fixing.
Static graph functionality can be used in three ways: | ||
|
||
- On the command line with `-graph` (and equivalent API). | ||
- This gets the scheduling improvements for well-specified projects, but allows underspecified projects to complete without error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh, technically yes?
$ dotnet msbuild -graph
Microsoft (R) Build Engine version 16.8.0-preview-20451-02+51a1071f8 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.
Stack overflow.
😰 #3757
|
||
Microsoft has an internal build system, [CloudBuild](https://www.microsoft.com/research/publication/cloudbuild-microsofts-distributed-and-caching-build-service/), that supports this and has proven that it is effective, but is heuristic-based and requires maintenance. | ||
|
||
MSBuild static graph features make it easier to implement a system like CloudBuild by building operations like graph construction and output caching into MSBuild itself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has anyone outside of CloudBuild done this and would we have guidance for someone who tried to build one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on issues, I know a couple of folks have played with it but I don't know if they got anywhere. We don't really have guidance--if there's external interest I think we could provide some.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not worth it as it'd likely only be the rarest of experts who try this or 1ES.
|
||
# Static Graph | ||
## What is static graph for? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any visual recreation of the graph available today (like a dgml diagram or something)? Might be an interesting future project as we invest more here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I believe only internally. @cdmihai added code that can spit out a DOT file--see for example
var dot = projectGraph.ToDot(); |
Which produces
As you can see, this quickly gets complex!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, we'd probably need to invest on making that more navigable and readable if we were to expose that. Not sure if there are changes customers can do based on the graph that are worth doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, if this becomes the default case I think we'll definitely need to figure out a nice visualization for it. We could do that today, and folks have done so in the past (https://www.visualstudiogeeks.com/blog/msbuild/visual%20studio/msbuild-dependency-visualizer for example), but it hasn't seemed super important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Co-authored-by: Marc Paine <marcpop@microsoft.com>
5579677
to
0f36d23
Compare
Several folks have asked for better documentation about what static graph is and why we think it's a good thing. That's . . . a very reasonable request!
cc @marcpopMSFT @KathleenDollard