Skip to content
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
249 changes: 249 additions & 0 deletions documentation/specs/multithreading/taskhost-IBuildEngine-callbacks.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,249 @@
# Design Specification: TaskHost IBuildEngine Callback Support

**Status:** Draft | **Related Issue:** #12863

---

## 1. Problem Statement

The MSBuild TaskHost (`OutOfProcTaskHostNode`) implements `IBuildEngine10` but lacks support for several callbacks. TaskHost is used when:
1. **`MSBUILDFORCEALLTASKSOUTOFPROC=1`** with `-mt` mode - forces non-thread-safe tasks out-of-proc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't these separate cases?

Suggested change
1. **`MSBUILDFORCEALLTASKSOUTOFPROC=1`** with `-mt` mode - forces non-thread-safe tasks out-of-proc
1. **`MSBUILDFORCEALLTASKSOUTOFPROC=1`**
2. `-mt` mode - forces non-thread-safe tasks out-of-proc

2. **Explicit `TaskHostFactory`** in `<UsingTask>` declarations

If tasks in TaskHost call unsupported callbacks, the build fails with MSB5022 or `NotImplementedException`.

**Note:** This is an infrequent scenario - a compatibility layer for multithreaded MSBuild, not a hot path.

### Unsupported Callbacks

| Callback | Interface | Current Behavior |
|----------|-----------|------------------|
| `IsRunningMultipleNodes` | IBuildEngine2 | Logs MSB5022, returns `false` |
| `BuildProjectFile` (4 params) | IBuildEngine | Logs MSB5022, returns `false` |
| `BuildProjectFile` (5 params) | IBuildEngine2 | Logs MSB5022, returns `false` |
| `BuildProjectFilesInParallel` (7 params) | IBuildEngine2 | Logs MSB5022, returns `false` |
| `BuildProjectFilesInParallel` (6 params) | IBuildEngine3 | Logs MSB5022, returns `false` |
| `Yield` / `Reacquire` | IBuildEngine3 | Silent no-op |
| `RequestCores` / `ReleaseCores` | IBuildEngine9 | Throws `NotImplementedException` |

**Evidence:** src/MSBuild/OutOfProcTaskHostNode.cs lines 270-405

---

## 2. Goals

1. **Full IBuildEngine support in TaskHost** - Tasks work identically whether in-proc or in TaskHost
2. **Backward compatibility** - Existing behavior unchanged for tasks that don't use callbacks
3. **Acceptable performance** - IPC overhead tolerable for typical callback patterns
4. **Support multithreaded builds** - Unblock `-mt` for real-world projects

**Non-Goal:** CLR2/net35 `MSBuildTaskHost.exe` support (never had callback support)

---

## 3. Architecture

### Current Communication Flow

```text
PARENT MSBuild TASKHOST Process
┌─────────────┐ ┌───────────────────────────┐
│ TaskHostTask│──TaskHostConfiguration─▶│ OutOfProcTaskHostNode │
│ │ │ └─_taskRunnerThread │
│ │◀──LogMessagePacket──────│ └─Task.Execute() │
│ │◀──TaskHostTaskComplete──│ │
└─────────────┘ └───────────────────────────┘
```

**Key files:**

- src/Build/Instance/TaskFactories/TaskHostTask.cs - Parent side
- src/MSBuild/OutOfProcTaskHostNode.cs - TaskHost side

### Proposed: Bidirectional Callback Forwarding

```text
PARENT MSBuild TASKHOST Process
┌─────────────┐ ┌───────────────────────────┐
│ TaskHostTask│──TaskHostConfiguration─▶│ OutOfProcTaskHostNode │
│ │ │ └─_taskRunnerThread │
│ │◀──LogMessagePacket──────│ │ │
│ │◀─CallbackRequest────────│ ├─task.Execute() │
│ │ │ │ └─BuildProject() │
│ │──CallbackResponse──────▶│ │ [blocks] │
│ │ │ │ │
│ │◀──TaskHostTaskComplete──│ └─[unblocks] │
└─────────────┘ └───────────────────────────┘
```

---

## 4. Design

### 4.1 Threading Model

**Critical constraint:** TaskHost has two threads:

- **Main thread** (`Run()`) - handles IPC via `WaitHandle.WaitAny()` loop
- **Task thread** (`_taskRunnerThread`) - executes `task.Execute()`

Callbacks are invoked from the task thread but responses arrive on the main thread.

**Solution:** Use `TaskCompletionSource<INodePacket>` per request:

1. Task thread creates request, registers TCS in `_pendingRequests[requestId]`
2. Task thread sends packet, calls `tcs.Task.Wait()`
3. Main thread receives response, calls `tcs.SetResult(packet)` to unblock task thread

### 4.2 New Packet Types

| Packet | Direction | Purpose |
|--------|-----------|---------|
| `TaskHostBuildRequest` | TaskHost → Parent | BuildProjectFile* calls |
| `TaskHostBuildResponse` | Parent → TaskHost | Build results + outputs |
| `TaskHostResourceRequest` | TaskHost → Parent | RequestCores/ReleaseCores |
| `TaskHostResourceResponse` | Parent → TaskHost | Cores granted |
| `TaskHostQueryRequest` | TaskHost → Parent | IsRunningMultipleNodes |
| `TaskHostQueryResponse` | Parent → TaskHost | Query result |
| `TaskHostYieldRequest` | TaskHost → Parent | Yield/Reacquire |
| `TaskHostYieldResponse` | Parent → TaskHost | Acknowledgment |

**Location:** `src/MSBuild/` (linked into Microsoft.Build.csproj). NOT in `src/Shared/` since MSBuildTaskHost (CLR2) is out of scope.

### 4.3 INodePacket.cs Changes

```csharp
public enum NodePacketType : byte
{
// ... existing ...
TaskHostBuildRequest = 0x20,
TaskHostBuildResponse = 0x21,
TaskHostResourceRequest = 0x22,
TaskHostResourceResponse = 0x23,
TaskHostQueryRequest = 0x24,
TaskHostQueryResponse = 0x25,
TaskHostYieldRequest = 0x26,
TaskHostYieldResponse = 0x27,
}
```

### 4.4 Key Implementation Points

**OutOfProcTaskHostNode (TaskHost side):**

- Add `ConcurrentDictionary<int, TaskCompletionSource<INodePacket>> _pendingRequests`
- Add `SendRequestAndWaitForResponse<TRequest, TResponse>()` helper
- Replace stub implementations with forwarding calls
- Add response handling in `HandlePacket()`

**TaskHostTask (Parent side):**

- Register handlers for new request packet types
- Add `HandleBuildRequest()`, `HandleResourceRequest()`, etc.
- Forward to real `IBuildEngine` and send response

---

## 5. ITaskItem Serialization

`TaskHostBuildResponse.TargetOutputsPerProject` contains `IDictionary<string, ITaskItem[]>` per project.

**Existing pattern:** `TaskParameter` class handles `ITaskItem` serialization for `TaskHostTaskComplete`. Use same approach.

**Reference:** src/Shared/TaskParameter.cs

---

## 6. Phased Rollout

| Phase | Scope | Risk | Effort |
|-------|-------|------|--------|
| 1 | `IsRunningMultipleNodes` | Low | 2 days |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still unsure if this is worth forwarding the request along vs hardcoding "yeah if you're in this mode assume parallelism".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is -mt /m:1 (or just /mt for msbuild.exe) the only way to hit this condition? What is the practical impact on the Tasks we have that use IsRunningMultipleNodes? Do we know if any 1P Tasks care about this condition?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's /m:1 + a taskhost for any reason (mt + unenlightened task, env var that forces everything to a task host, explicit taskhost request in usingtask).

I can't imagine that it's widely used. I see one use in our repo in the task version of the MSBuild task that changes how StopOnFirstFailure works, which is itself a silly thing to do . . .

| 2 | `RequestCores`/`ReleaseCores` | Medium | 3 days |
| 3 | `Yield`/`Reacquire` | Medium | 3 days |
| 4 | `BuildProjectFile*` | High | 5-7 days |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BuildProjectFile is the most critical one, for XAML, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SimaTian is that the one you hit in roslyn?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My current take is that this is a niche compat layer we should invest minimum amount possible into.

  • IsRunningMultipleNodes, RequestCores/ReleaseCores may be used by tasks for some decisions that may change perf characteristics of such tasks, if the customers care about that case they should move to make the task multithreadable. just ignore

  • Yield,Reacquire this is a very "Worker Node" concept that is weird with taskhosts, we would need to host multiple taskhosts inside of 1 taskhost process and figure out how to communicate with which, and for what benefit? I think we should just ignore it forever. Same as before, if customers really want Yielding they can migrate.

  • if we don't implement the multiple taskhosts inside of 1 taskhost process from ^ I think BuildProjectFile needs to be implemented in a way that ignores max node count and just spawns an inproc node and waits until that finishes. Again if customers don't like this they should enlighten.

  • or the inproc node might "Yield" and that means it'd need to connect to a fresh taskhost where it could push non-mt-compatible tasks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rainersigwald does not this callback require (when called in the working node) to free the worker node for other projects to build, similar to yielding scenarios? Otherwise, I guess, we might get into the situation when we exhausted all msbuild nodes - all nodes would be blocked waiting for the resolution of the callback. So, the same might happen with task host if we implement just a blocking behavior - and much quicker because the thread nodes tied to the specific task host. I guess for proper implementation we will need to implement freeing the task host for more tasks to come to it from another projects, and a way to save and recover the state to be able to continue the execution. This sounds quite like a challenge.

Copy link
Member

@rainersigwald rainersigwald Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am concerned about the short-term impact of not supporting Yield/Reacquire: they make running tool processes fairly efficient by yielding the node for other MSBuild work while running the tool "in the background".

With BuildProjectFiles: we must free the MSBuild-layer part of the node to do additional work, including running other tasks--if we don't, we can hit the exhaustion case (consider an -m:1 build with one xaml project: do some stuff on the entry project, build the "inner" one including running tasks, come back to finish).

In normal execution we would handle that by leaving a (System.Threading.)Task dangling and yielding (in the System.Threading sense) to do the other work in-proc (so you'd have both the task that called BuildProjectFile and the tasks launched in the called project running concurrently).

I think we have to support that to have a good hope of overall MT success here, since lots of projects use XAML tasks that ship in .NET Framework and can't be updated to the new model.


**Rationale:** Phase 1 validates the forwarding infrastructure with minimal risk. Phase 4 is highest risk due to complex `ITaskItem[]` serialization and recursive build scenarios.

---

## 7. Open Questions for Review

### Q1: Yield semantics in TaskHost

Current no-op may be intentional - TaskHost is single-threaded per process. Options:

- A) Forward to parent and actually yield (allows scheduler to run other work)
- B) Keep as no-op (current behavior, safest)

**Recommendation:** (B) initially - Yield/Reacquire are rarely used by tasks, and current no-op behavior has shipped. Revisit if real-world need arises.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, though I think the most expensive tasks do tend to yield. Definitely something that can wait.


### Q2: Error handling for parent crash during callback

If parent dies while TaskHost awaits response:

- A) Timeout and fail task
- B) Detect pipe closure immediately and fail
- C) Both

**Recommendation:** (C) - `_nodeEndpoint.LinkStatus` check + timeout

---

## 8. Risks

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Deadlock in callback wait | Low | High | Timeouts, no lock held during wait, main thread never waits on task thread |
| IPC serialization bugs | Medium | Medium | Packet round-trip unit tests |

**Note:** No "breaking existing behavior" risk - callbacks currently fail/throw, so any implementation is an improvement.

---

## 9. Testing Strategy

### Unit Tests

- Packet serialization round-trip
- Request-response correlation
- Timeout handling
- Cancellation during callback

### Integration Tests

- End-to-end `-mt` build with callback-using task
- TaskHost reuse across multiple tasks
- Recursive `BuildProjectFile` scenarios

### Stress Tests

- Many concurrent callbacks
- Large `ITaskItem[]` outputs

---

## 10. File Change Summary

| File | Change |
|------|--------|
| `src/Shared/INodePacket.cs` | Add enum values |
| `src/MSBuild/TaskHostBuildRequest.cs` | New (link to Microsoft.Build.csproj) |
| `src/MSBuild/TaskHostBuildResponse.cs` | New (link to Microsoft.Build.csproj) |
| `src/MSBuild/TaskHostResourceRequest.cs` | New (link to Microsoft.Build.csproj) |
| `src/MSBuild/TaskHostResourceResponse.cs` | New (link to Microsoft.Build.csproj) |
| `src/MSBuild/TaskHostQueryRequest.cs` | New (link to Microsoft.Build.csproj) |
| `src/MSBuild/TaskHostQueryResponse.cs` | New (link to Microsoft.Build.csproj) |
| `src/MSBuild/TaskHostYieldRequest.cs` | New (link to Microsoft.Build.csproj) |
| `src/MSBuild/TaskHostYieldResponse.cs` | New (link to Microsoft.Build.csproj) |
| `src/MSBuild/OutOfProcTaskHostNode.cs` | Implement forwarding |
| `src/Build/Instance/TaskFactories/TaskHostTask.cs` | Handle requests |

---

## Appendix: References

- Current stub implementations: src/MSBuild/OutOfProcTaskHostNode.cs lines 270-405
- Existing packet serialization: src/Shared/TaskParameter.cs
- TaskHost message loop: src/MSBuild/OutOfProcTaskHostNode.cs lines 650-710
- Parent message loop: src/Build/Instance/TaskFactories/TaskHostTask.cs lines 270-320
Loading