Motivation docs for graph #5741

Merged
merged 7 commits into from
Oct 7, 2020
171 changes: 162 additions & 9 deletions documentation/specs/static-graph.md
@@ -1,6 +1,13 @@
# Static Graph

- [Static Graph](#static-graph)
- [Overview](#overview)
- [Motivations](#motivations)
- [What is static graph for?](#what-is-static-graph-for)
- [Weakness of the old model: project-level scheduling](#weakness-of-the-old-model-project-level-scheduling)
- [Weakness of the old model: incrementality](#weakness-of-the-old-model-incrementality)
- [Weakness of the old model: caching and distributability](#weakness-of-the-old-model-caching-and-distributability)
- [What is static graph?](#what-is-static-graph)
- [Design documentation](#design-documentation)
- [Design goals](#design-goals)
- [Project Graph](#project-graph)
- [Build dimensions](#build-dimensions)
- [Multitargeting](#multitargeting)
@@ -19,12 +26,83 @@
- [Detours](#detours)
- [Isolation requirement](#isolation-requirement)
- [Tool servers](#tool-servers)
- [Examples](#examples)

# Static Graph
## What is static graph for?
Member

Is there any visual recreation of the graph available today (like a dgml diagram or something)? Might be an interesting future project as we invest more here.

Member Author

Yes, but I believe only internally. @cdmihai added code that can spit out a DOT file--see for example

var dot = projectGraph.ToDot();

Which produces

graph dot

As you can see, this quickly gets complex!

Member

Interesting, we'd probably need to invest on making that more navigable and readable if we were to expose that. Not sure if there are changes customers can do based on the graph that are worth doing.

Member Author

Yeah, if this becomes the default case I think we'll definitely need to figure out a nice visualization for it. We could do that today, and folks have done so in the past (https://www.visualstudiogeeks.com/blog/msbuild/visual%20studio/msbuild-dependency-visualizer for example), but it hasn't seemed super important.

Contributor

@cdmihai Sep 22, 2020

The binlog viewer also prints out the project dependency graph :D. But it gets pretty hairy beyond 5-10 projects. However, it's super useful to understand the p2p protocol :)

image


As a repo gets bigger and more complex, weaknesses in MSBuild's scheduling and incrementality models become more apparent. MSBuild's static graph features are intended to ameliorate these weaknesses while remaining as compatible as possible with existing projects and SDKs.

MSBuild projects can refer to other projects by using the `MSBuild` task to execute targets in another project and return values. In `Microsoft.Common.targets`, `ProjectReference` items are transformed into `MSBuild` task executions in order to provide a user-friendly interface: "reference the output of these projects".
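As a rough sketch of what such a direct call looks like, assuming a hypothetical referencing project with a sibling project at `..\B\B.csproj`, a target can invoke the `MSBuild` task itself and collect the returned items. The target name `CollectReferenceOutputs` and the item name `ReferenceOutputs` are illustrative only and are not taken from `Microsoft.Common.targets`.

```xml
<Project>
  <!-- Hypothetical target: calls the Build target of another project
       and captures the items that target returns. -->
  <Target Name="CollectReferenceOutputs">
    <MSBuild Projects="..\B\B.csproj" Targets="Build">
      <!-- TargetOutputs holds whatever the called target(s) return. -->
      <Output TaskParameter="TargetOutputs" ItemName="ReferenceOutputs" />
    </MSBuild>
    <Message Importance="high" Text="Reference outputs: @(ReferenceOutputs)" />
  </Target>
</Project>
```

The engine only learns that `B.csproj` is needed when this target runs, which is the root of the scheduling weakness described next.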

### Weakness of the old model: project-level scheduling

Because references to other projects aren't known until a target in the referencing project calls the `MSBuild` task, the MSBuild engine cannot start working on building referenced projects until the referencing project yields. For example, if project `A` depended on `B`, `C`, and `D` and was being built with more than 3-way parallelism, an ideal build would run `B`, `C`, and `D` in parallel with the parts of `A` that could execute before the references were available.

Today, the order of operations of this build is:

1. `A` completes evaluation and starts building, doing isolated work until it gets to `ResolveProjectReferences`.
1. In parallel, `B`, `C`, and `D` run the requested targets.
1. `A` resumes building and completes.

With graph-aware scheduling, this becomes:

1. `A`, `B`, `C`, and `D` evaluate in parallel.
1. `B`, `C`, and `D` build to completion in parallel.
1. `A` builds, and instantly gets cached results for the `MSBuild` task calls in `ResolveProjectReferences`.

### Weakness of the old model: incrementality

[Incremental build](https://docs.microsoft.com/visualstudio/msbuild/incremental-builds) (that is, "redo only the parts of the build that would produce different outputs compared to the last build") is the most powerful tool to reduce build times and increase developer inner-loop speed.

MSBuild supports incremental builds by allowing a target to be skipped if the target's outputs are up to date with its inputs. This allows tools like the compiler to be skipped when possible. But since the incrementality is at the target level, MSBuild must fully evaluate the project and walk through all targets, running those that are out of date or that don't specify inputs and outputs.
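A minimal sketch of target-level incrementality, assuming a hypothetical project that copies text files: the `Inputs` and `Outputs` attributes let MSBuild skip the target when every output is newer than its inputs. The item name `TextInput`, the property `OutputFolder`, and the target name `CopyTextFiles` are made up for illustration.

```xml
<Project>
  <PropertyGroup>
    <!-- Hypothetical output location for this sketch. -->
    <OutputFolder>out\</OutputFolder>
  </PropertyGroup>

  <ItemGroup>
    <!-- Hypothetical inputs: every .txt file under input\. -->
    <TextInput Include="input\*.txt" />
  </ItemGroup>

  <!-- Skipped when all Outputs are up to date with respect to Inputs. -->
  <Target Name="CopyTextFiles"
          Inputs="@(TextInput)"
          Outputs="@(TextInput->'$(OutputFolder)%(Filename)%(Extension)')">
    <Copy SourceFiles="@(TextInput)" DestinationFolder="$(OutputFolder)" />
  </Target>
</Project>
```

Even when the target is skipped, MSBuild still has to evaluate the project and visit the target to make that decision, which is the cost the following example illustrates.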

Consider a simple solution with a library and an application that depends on the library. Suppose you build, then make a change in the application's source code, then build again.

The second build will:

1. Build the library project, skipping all targets that define inputs and outputs.
1. Build the application project.

But using higher-level knowledge, we can see a more-optimal build:

1. Skip everything involving the library project, because _none_ of its inputs have changed.
1. Build only the application project.

Visual Studio offers a ["fast up-to-date check"](https://github.com/dotnet/project-system/blob/cd275918ef9f181f6efab96715a91db7aabec832/docs/up-to-date-check.md) system that gets closer to the latter, but MSBuild itself does not.

### Weakness of the old model: caching and distributability

For very large builds, including many Microsoft products, the fact that MSBuild can build in parallel only on a single machine is a major impediment, even if incrementality is addressed.

Ideally, a build could span multiple computers, and each could use results generated on another machine as inputs to its own build projects. In addition, if all of a project's inputs remain unchanged, the system would ideally reuse the outputs of the project, even if they were built long ago on another computer.

Microsoft has an internal build system, [CloudBuild](https://www.microsoft.com/research/publication/cloudbuild-microsofts-distributed-and-caching-build-service/), that supports this and has proven effective, but it is heuristic-based and requires maintenance.

MSBuild static graph features make it easier to implement a system like CloudBuild by building operations like graph construction and output caching into MSBuild itself.
Member

Has anyone outside of CloudBuild done this and would we have guidance for someone who tried to build one?

Member Author

Based on issues, I know a couple of folks have played with it but I don't know if they got anywhere. We don't really have guidance--if there's external interest I think we could provide some.

Member

Probably not worth it as it'd likely only be the rarest of experts who try this or 1ES.


## What is static graph?

MSBuild's static graph extends the MSBuild engine and APIs with new functionality to improve on these weaknesses:

- The ability to [construct a directed acyclic graph of MSBuild projects](#project-graph) given an entry point (solution or project).
- The ability to consider that graph when scheduling projects for build.
- The ability to cache MSBuild's internal build results (metadata about outputs, not the outputs themselves) across build invocations.
- The ability to [enforce restrictions on builds](#isolated-builds) to ensure that the graph is correct and complete.

Static graph functionality can be used in three ways:

- On the command line with `-graph` (and equivalent API).
- This gets the scheduling improvements for well-specified projects, but allows underspecified projects to complete without error.
Member

Will the -graph command error if it cannot produce an acyclic graph?

Member Author

Uh, technically yes?

$ dotnet msbuild -graph
Microsoft (R) Build Engine version 16.8.0-preview-20451-02+51a1071f8 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.

Stack overflow.

😰 #3757

- On the command line with `-graph -isolate` (and equivalent API).
- This gets the scheduling improvements and also enforces that the graph is correct and complete. In this mode, MSBuild will produce an error if there is an `MSBuild` task invocation that was not known to the graph ahead of time.
Member

May want to explain a bit what "graph is correct" means.

- As part of a higher-order build system that uses [single project isolated builds](#single-project-isolated-builds) to provide caching and/or distribution on top of the built-in functionality. The only known implementation of this system is Microsoft-internal currently.

"Correct and complete" here means that the static graph can be used to accurately predict all targets that need to be built for all projects in the graph, and all of the references between projects. This is required for the higher-order build system scenario, because an unknown reference couldn't be satisfied at runtime (as it is in regular MSBuild and `-graph` with no `-isolate` scenarios).

## Overview
## Design documentation

### Design goals

### Motivations
- Stock projects can build with "project-level build" and, if clean, onboard to Microsoft-internal build engines with caching and distribution.
- Stock projects will be "project-level build" clean.
- Add determinism to MSBuild w.r.t. project dependencies. Today MSBuild discovers projects just in time, as it finds MSBuild tasks. This means nothing guarantees that two consecutive executions produce the same graph, other than project files that are hopefully sane. With the static graph, you’d know the shape of the build graph before the build starts.
@@ -76,10 +154,10 @@ Multitargeting supporting SDKs MUST implement the following properties and seman
- Node edges
- When project A references multitargeting project B, and B is identified as an outer build, the graph node for project A will reference both the outer build of B, and all the inner builds of B. The edges to the inner builds are speculative, as at build time only one inner build gets referenced. However, the graph cannot know at evaluation time which inner build will get chosen.
- When multitargeting project B is a root, then the outer build node for B will reference the inner builds of B.
- For multitargeting projects, the `ProjectReference` item gets applied only to inner builds. An outer build cannot have its own distinct `ProjectReference`s; it is the inner builds that reference other project files, not the outer build. This constraint might get relaxed in the future via additional configuration, to allow outer-build-specific references.

These specific rules represent the minimal rules required to represent multitargeting in `Microsoft.Net.Sdk`. As we adopt SDKs whose multitargeting complexity cannot be expressed with the above rules, we'll extend the rules.
For example, `InnerBuildProperty` could become `InnerBuildProperties` for SDKs where there are multiple multitargeting global properties.

For example, here is a trimmed down `Microsoft.Net.Sdk` multitargeting project:
```xml
<!-- project contents are collapsed in this diff view -->
```
@@ -315,7 +393,7 @@ These incremental builds can even be extended to multiple projects by keeping a
<!-- workflow -->
Single project builds can be achieved by providing MSBuild with input and output cache files.

The input cache files contain the cached results of all of the current project's references. This way, when the current project executes, it will naturally build its references via [MSBuild task](https://docs.microsoft.com/en-us/visualstudio/msbuild/msbuild-task) calls. The engine, instead of executing these tasks, will serve them from the provided input caches.

The output cache file tells MSBuild where it should serialize the results of the current project. This output cache would become an input cache for all other projects that depend on the current project.
The output cache file can be omitted, in which case the build would just reuse prior results but not write out any new results. This could be useful when one wants to replay a build from previous caches.
@@ -345,7 +423,7 @@ Output cache file constraints:

#### APIs
Caches are provided via [BuildParameters](https://github.com/Microsoft/msbuild/blob/2d4dc592a638b809944af10ad1e48e7169e40808/src/Build/BackEnd/BuildManager/BuildParameters.cs#L746-L764). They are applied in `BuildManager.BeginBuild`.

#### Command line
Caches are provided to MSBuild.exe via the multi-value `/inputResultsCaches` and the single-value `/outputResultsCache`.

## I/O Tracking
@@ -381,3 +459,78 @@ To support this scenario, a new MSBuild Task API could be introduced which allow
Similarly for a theoretical server mode for MSBuild, MSBuild would need to report its own I/O rather than the higher-order build engine detouring the process externally. For example, if the higher-order build engine connected to an existing running MSBuild process to make build requests, it could not detour that process and so MSBuild would need to report all I/O done as part of a particular build request.

**OPEN ISSUE:** As described above in an open issue, tool servers are the only scenario which would not be supportable by just externally detouring the MSBuild process. The amount of investment required to enable tool servers is quite high and spans across multiple codebases: MSBuild needs to detour itself, MSBuild needs to expose a new Tasks API, the `Csc` task needs to opt into that API, and the higher-order build engine needs to opt in to MSBuild reporting its own I/O, as well as detecting that the feature is supported in the version of MSBuild it's using. Tool servers may add substantial performance gains, but the investment is also substantial.

## Examples

To illustrate the difference between `-graph` and `-graph -isolate`, consider these two projects, which are minimal except for a new target in the referenced project that is consumed in the referencing project.

`Referenced\Referenced.csproj`:

```xml
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>netcoreapp3.1</TargetFramework>
<UnusualOutput>Configuration\Unusual.txt</UnusualOutput>
</PropertyGroup>

<Target Name="UnusualThing" Returns="$(UnusualOutput)" />
</Project>
```

`Referencing\Referencing.csproj`:

```xml
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>netcoreapp3.1</TargetFramework>
</PropertyGroup>

<ItemGroup>
<ProjectReference Include="..\Referenced\Referenced.csproj" />
</ItemGroup>

<Target Name="GetUnusualThing" BeforeTargets="BeforeBuild">
<MSBuild Projects="..\Referenced\Referenced.csproj"
Targets="UnusualThing">
<Output TaskParameter="TargetOutputs"
ItemName="Content" />
</MSBuild>
</Target>
</Project>
```

This project can successfully build with `-graph`:

```sh-session
$ dotnet msbuild Referencing\Referencing.csproj -graph
"Static graph loaded in 0.253 seconds: 2 nodes, 1 edges"
Referenced -> S:\Referenced\bin\Debug\netcoreapp3.1\Referenced.dll
Referencing -> S:\Referencing\bin\Debug\netcoreapp3.1\Referencing.dll
```

But it fails with `-graph -isolate`:

```sh-session
$ dotnet msbuild Referencing\Referencing.csproj -graph -isolate
"Static graph loaded in 0.255 seconds: 2 nodes, 1 edges"
Referenced -> S:\Referenced\bin\Debug\netcoreapp3.1\Referenced.dll
S:\Referencing\Referencing.csproj(12,5): error : MSB4252: Project "S:\Referencing\Referencing.csproj" with global properties
S:\Referencing\Referencing.csproj(12,5): error : (IsGraphBuild=true)
S:\Referencing\Referencing.csproj(12,5): error : is building project "S:\Referenced\Referenced.csproj" with global properties
S:\Referencing\Referencing.csproj(12,5): error : (IsGraphBuild=true)
S:\Referencing\Referencing.csproj(12,5): error : with the (UnusualThing) target(s) but the build result for the built project is not in the engine cache. In isolated builds this could mean one of the following:
S:\Referencing\Referencing.csproj(12,5): error : - the reference was called with a target which is not specified in the ProjectReferenceTargets item in project "S:\Referencing\Referencing.csproj"
S:\Referencing\Referencing.csproj(12,5): error : - the reference was called with global properties that do not match the static graph inferred nodes
S:\Referencing\Referencing.csproj(12,5): error : - the reference was not explicitly specified as a ProjectReference item in project "S:\Referencing\Referencing.csproj"
S:\Referencing\Referencing.csproj(12,5): error :
```

This part of the error is the problem here:

> the reference was called with a target which is not specified in the ProjectReferenceTargets item in project "S:\Referencing\Referencing.csproj"
Member

Think you need to elaborate a bit more on why this is an error in the isolated mode. Particularly because I'm honestly unsure of what the exact problem is here 😄

Contributor

@cdmihai Sep 18, 2020

TLDR: /isolate ensures the graph build does at least the work that vanilla msbuild does. But it's pretty complicated to use and probably only suitable for sdk writers.

  • it brutally checks the correctness of "what targets should a graph node be called with?". In top-down just-in-time vanilla msbuild a project gets built at the time of discovery. It gets called with whatever arbitrary target values end up in <MSBuild Targets=$(some_targets). During a graph build the runtime graph gets statically inferred and built bottom up, but without doing symbolic execution on the xml we have no idea what targets to call a node with. We solved this by adding a way to statically declare what targets a project calls on its references via the ProjectReferenceTargets protocol, and then we run a data flow over the graph to propagate them. But how do we know the declarations are correct? At minimum the graph build should build at least what vanilla msbuild builds. So one can use /isolate to ensure that the called targets are a superset of the targets run by vanilla msbuild. This is most useful for SDK writers, to ensure that their p2p target calling patterns are inferred by static graph. Not so much useful for end users unless they customize their build.
  • correctness wise, even if /isolate is violated, the build would most likely still end up correct. If A calls B with Foo, and Foo is undeclared, then Foo won't get run when the graph build builds B, but will get run regardless when the graph build A, and then A calls into B with Foo. However, if you have a distributed / cached build, it would be nice to capture and cache Foo's potentially expensive CPU work and IO side effects as part of B's scheduled run and cache entry, not as part of A's run. The higher order build engine might even complain if Foo touches the outputs of B, or the outputs of B's references. The fact that Foo's side effects get captured in A's cache entry rather than B's cache entry may or may not be a problem, depending on what one uses the caches for. Just binplacing cached results is fine, doing analyses to infer file dataflow might not be fine.
  • MSBuild needs a way to build a project "in isolation", without following references. This is useful when another tool is orchestrating the build. Both VS and Quickbuild build their own msbuild graph and then tell msbuild to build each project in isolation, without following references. To do this both of them rely on the sdks respecting the BuildProjectReferences=false convention. /isolate could be used to check that BuildProjectReferences is actually respected.
  • it enables a potentially interesting way of doing a graph build. If we can correctly infer all the targets a project gets called with, we could visit each node exactly once and call it with all the inferred targets. Then we could serialize those target results and feed them to all the referencing projects. This would enable each project only doing its own work, without reaching out and doing any work in other projects.


This is unacceptable in an isolated build because it means that the cached outputs of `Referenced.csproj` will be incomplete: they won't have the results of the `UnusualThing` target, because it's nonstandard (and thus not one of the "well understood to be called on `ProjectReference`s" targets that are handled by default).
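One way this might be addressed, sketched here under assumptions: the error text names a `ProjectReferenceTargets` item, which lets a project declare the targets it calls on its references. Assuming that protocol, `Referencing.csproj` could declare that building it also calls `UnusualThing` on its references. The `Include="Build"` mapping below assumes the project is invoked with the `Build` target, and this fragment has not been verified against the example projects above.

```xml
<ItemGroup>
  <!-- Assumption: when this project is built with the Build target,
       it also calls UnusualThing on each ProjectReference. -->
  <ProjectReferenceTargets Include="Build" Targets="UnusualThing" />
</ItemGroup>
```

With a declaration like this, the graph could schedule `UnusualThing` as part of building `Referenced.csproj`, so its result would already be in the cache when `GetUnusualThing` asks for it.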

TODO: write docs for SDK authors/build engineers on how to teach the graph about this sort of thing.