Skip to content

Conversation

@jansvoboda11
Copy link
Contributor

The dependency scanner was initially using a fair amount of infrastructure provided by the clangTooling library. Over time, the needs for bespoke handling of command lines grew and the overlap with the tooling library kept shrinking. I don't think the library provides any value anymore.

I decided to remove the dependency and only reimplement the small bits required by the scanner.

This allowed for a nice simplification, where we no longer need to create temporary dummy FileManager instances (mis-named as DriverFileMgr in some parts) and SourceManager instances to attach to the DiagnosticsEngine. That code was copied from the tooling library to support DiagnosticConsumers that expect these to exist. The scanner uses a closed set of consumers and none need these objects to exist.

The motivation for this (hopefully NFC) patch are some new restrictions to how VFS's can be propagated in Clang that I'm working on.

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Jul 21, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 21, 2025

@llvm/pr-subscribers-clang

Author: Jan Svoboda (jansvoboda11)

Changes

The dependency scanner was initially using a fair amount of infrastructure provided by the clangTooling library. Over time, the needs for bespoke handling of command lines grew and the overlap with the tooling library kept shrinking. I don't think the library provides any value anymore.

I decided to remove the dependency and only reimplement the small bits required by the scanner.

This allowed for a nice simplification, where we no longer need to create temporary dummy FileManager instances (mis-named as DriverFileMgr in some parts) and SourceManager instances to attach to the DiagnosticsEngine. That code was copied from the tooling library to support DiagnosticConsumers that expect these to exist. The scanner uses a closed set of consumers and none need these objects to exist.

The motivation for this (hopefully NFC) patch are some new restrictions to how VFS's can be propagated in Clang that I'm working on.


Full diff: https://github.com/llvm/llvm-project/pull/149904.diff

1 Files Affected:

  • (modified) clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp (+32-37)
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index 9bd85479d9810..3a80f3237b743 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -24,7 +24,6 @@
 #include "clang/Tooling/DependencyScanning/DependencyScanningService.h"
 #include "clang/Tooling/DependencyScanning/InProcessModuleCache.h"
 #include "clang/Tooling/DependencyScanning/ModuleDepCollector.h"
-#include "clang/Tooling/Tooling.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/Support/Allocator.h"
 #include "llvm/Support/Error.h"
@@ -376,7 +375,7 @@ class ScanningDependencyDirectivesGetter : public DependencyDirectivesGetter {
 
 /// A clang tool that runs the preprocessor in a mode that's optimized for
 /// dependency scanning for the given compiler invocation.
-class DependencyScanningAction : public tooling::ToolAction {
+class DependencyScanningAction {
 public:
   DependencyScanningAction(
       DependencyScanningService &Service, StringRef WorkingDirectory,
@@ -388,9 +387,9 @@ class DependencyScanningAction : public tooling::ToolAction {
         DisableFree(DisableFree), ModuleName(ModuleName) {}
 
   bool runInvocation(std::shared_ptr<CompilerInvocation> Invocation,
-                     FileManager *DriverFileMgr,
+                     IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS,
                      std::shared_ptr<PCHContainerOperations> PCHContainerOps,
-                     DiagnosticConsumer *DiagConsumer) override {
+                     DiagnosticConsumer *DiagConsumer) {
     // Make a deep copy of the original Clang invocation.
     CompilerInvocation OriginalInvocation(*Invocation);
     // Restore the value of DisableFree, which may be modified by Tooling.
@@ -419,8 +418,8 @@ class DependencyScanningAction : public tooling::ToolAction {
     // Create the compiler's actual diagnostics engine.
     sanitizeDiagOpts(ScanInstance.getDiagnosticOpts());
     assert(!DiagConsumerFinished && "attempt to reuse finished consumer");
-    ScanInstance.createDiagnostics(DriverFileMgr->getVirtualFileSystem(),
-                                   DiagConsumer, /*ShouldOwnClient=*/false);
+    ScanInstance.createDiagnostics(*FS, DiagConsumer,
+                                   /*ShouldOwnClient=*/false);
     if (!ScanInstance.hasDiagnostics())
       return false;
 
@@ -441,9 +440,9 @@ class DependencyScanningAction : public tooling::ToolAction {
         any(Service.getOptimizeArgs() & ScanningOptimizations::VFS);
 
     // Support for virtual file system overlays.
-    auto FS = createVFSFromCompilerInvocation(
-        ScanInstance.getInvocation(), ScanInstance.getDiagnostics(),
-        DriverFileMgr->getVirtualFileSystemPtr());
+    FS = createVFSFromCompilerInvocation(ScanInstance.getInvocation(),
+                                         ScanInstance.getDiagnostics(),
+                                         std::move(FS));
 
     // Create a new FileManager to match the invocation's FileSystemOptions.
     auto *FileMgr = ScanInstance.createFileManager(FS);
@@ -554,9 +553,6 @@ class DependencyScanningAction : public tooling::ToolAction {
     if (Result)
       setLastCC1Arguments(std::move(OriginalInvocation));
 
-    // Propagate the statistics to the parent FileManager.
-    DriverFileMgr->AddStats(ScanInstance.getFileManager());
-
     return Result;
   }
 
@@ -669,15 +665,14 @@ llvm::Error DependencyScanningWorker::computeDependencies(
 }
 
 static bool forEachDriverJob(
-    ArrayRef<std::string> ArgStrs, DiagnosticsEngine &Diags, FileManager &FM,
+    ArrayRef<std::string> ArgStrs, DiagnosticsEngine &Diags,
+    IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS,
     llvm::function_ref<bool(const driver::Command &Cmd)> Callback) {
   SmallVector<const char *, 256> Argv;
   Argv.reserve(ArgStrs.size());
   for (const std::string &Arg : ArgStrs)
     Argv.push_back(Arg.c_str());
 
-  llvm::vfs::FileSystem *FS = &FM.getVirtualFileSystem();
-
   std::unique_ptr<driver::Driver> Driver = std::make_unique<driver::Driver>(
       Argv[0], llvm::sys::getDefaultTargetTriple(), Diags,
       "clang LLVM compiler", FS);
@@ -687,7 +682,8 @@ static bool forEachDriverJob(
   bool CLMode = driver::IsClangCL(
       driver::getDriverMode(Argv[0], ArrayRef(Argv).slice(1)));
 
-  if (llvm::Error E = driver::expandResponseFiles(Argv, CLMode, Alloc, FS)) {
+  if (llvm::Error E =
+          driver::expandResponseFiles(Argv, CLMode, Alloc, FS.get())) {
     Diags.Report(diag::err_drv_expand_response_file)
         << llvm::toString(std::move(E));
     return false;
@@ -710,17 +706,25 @@ static bool forEachDriverJob(
 
 static bool createAndRunToolInvocation(
     std::vector<std::string> CommandLine, DependencyScanningAction &Action,
-    FileManager &FM,
+    IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS,
     std::shared_ptr<clang::PCHContainerOperations> &PCHContainerOps,
     DiagnosticsEngine &Diags, DependencyConsumer &Consumer) {
 
   // Save executable path before providing CommandLine to ToolInvocation
   std::string Executable = CommandLine[0];
-  ToolInvocation Invocation(std::move(CommandLine), &Action, &FM,
-                            PCHContainerOps);
-  Invocation.setDiagnosticConsumer(Diags.getClient());
-  Invocation.setDiagnosticOptions(&Diags.getDiagnosticOptions());
-  if (!Invocation.run())
+
+  llvm::opt::ArgStringList Argv;
+  for (const std::string &Str : ArrayRef(CommandLine).drop_front())
+    Argv.push_back(Str.c_str());
+
+  auto Invocation = std::make_shared<CompilerInvocation>();
+  if (!CompilerInvocation::CreateFromArgs(*Invocation, Argv, Diags)) {
+    // FIXME: Should we just go on like cc1_main does?
+    return false;
+  }
+
+  if (!Action.runInvocation(std::move(Invocation), std::move(FS),
+                            PCHContainerOps, Diags.getClient()))
     return false;
 
   std::vector<std::string> Args = Action.takeLastCC1Arguments();
@@ -733,23 +737,14 @@ bool DependencyScanningWorker::scanDependencies(
     DependencyConsumer &Consumer, DependencyActionController &Controller,
     DiagnosticConsumer &DC, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS,
     std::optional<StringRef> ModuleName) {
-  auto FileMgr =
-      llvm::makeIntrusiveRefCnt<FileManager>(FileSystemOptions{}, FS);
-
   std::vector<const char *> CCommandLine(CommandLine.size(), nullptr);
   llvm::transform(CommandLine, CCommandLine.begin(),
                   [](const std::string &Str) { return Str.c_str(); });
   auto DiagOpts = CreateAndPopulateDiagOpts(CCommandLine);
   sanitizeDiagOpts(*DiagOpts);
-  IntrusiveRefCntPtr<DiagnosticsEngine> Diags =
-      CompilerInstance::createDiagnostics(FileMgr->getVirtualFileSystem(),
-                                          *DiagOpts, &DC,
-                                          /*ShouldOwnClient=*/false);
-
-  // Although `Diagnostics` are used only for command-line parsing, the
-  // custom `DiagConsumer` might expect a `SourceManager` to be present.
-  SourceManager SrcMgr(*Diags, *FileMgr);
-  Diags->setSourceManager(&SrcMgr);
+  auto Diags = CompilerInstance::createDiagnostics(*FS, *DiagOpts, &DC,
+                                                   /*ShouldOwnClient=*/false);
+
   // DisableFree is modified by Tooling for running
   // in-process; preserve the original value, which is
   // always true for a driver invocation.
@@ -759,11 +754,11 @@ bool DependencyScanningWorker::scanDependencies(
 
   bool Success = false;
   if (CommandLine[1] == "-cc1") {
-    Success = createAndRunToolInvocation(CommandLine, Action, *FileMgr,
+    Success = createAndRunToolInvocation(CommandLine, Action, FS,
                                          PCHContainerOps, *Diags, Consumer);
   } else {
     Success = forEachDriverJob(
-        CommandLine, *Diags, *FileMgr, [&](const driver::Command &Cmd) {
+        CommandLine, *Diags, FS, [&](const driver::Command &Cmd) {
           if (StringRef(Cmd.getCreator().getName()) != "clang") {
             // Non-clang command. Just pass through to the dependency
             // consumer.
@@ -782,7 +777,7 @@ bool DependencyScanningWorker::scanDependencies(
           // system to ensure that any file system requests that
           // are made by the driver do not go through the
           // dependency scanning filesystem.
-          return createAndRunToolInvocation(std::move(Argv), Action, *FileMgr,
+          return createAndRunToolInvocation(std::move(Argv), Action, FS,
                                             PCHContainerOps, *Diags, Consumer);
         });
   }

Copy link
Contributor

@qiongsiwu qiongsiwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OMG this is so nice! Thanks so much for doing it! LGTM!

Copy link
Collaborator

@benlangmuir benlangmuir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scanner uses a closed set of consumers and none need these objects to exist.

I'm not sure what I think of this. The scanner has a library interface that allows an arbitrary diagnostic consumer. On the other hand, maybe we shouldn't be assuming a source manager in diagnostic consumers in the first place...

@jansvoboda11
Copy link
Contributor Author

The scanner uses a closed set of consumers and none need these objects to exist.

I'm not sure what I think of this. The scanner has a library interface that allows an arbitrary diagnostic consumer. On the other hand, maybe we shouldn't be assuming a source manager in diagnostic consumers in the first place...

I'm leaning towards not assuming that in diagnostic consumers. The DiagnosticsEngine is commonly used in contexts where there's no SourceManager, so this never made sense to me.

@jansvoboda11 jansvoboda11 requested a review from benlangmuir July 22, 2025 16:19
@jansvoboda11 jansvoboda11 merged commit aa1b416 into llvm:main Jul 22, 2025
9 checks passed
@jansvoboda11 jansvoboda11 deleted the scanner-simplifications branch July 22, 2025 17:10
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jul 28, 2025
The dependency scanner was initially using a fair amount of
infrastructure provided by the `clangTooling` library. Over time, the
needs for bespoke handling of command lines grew and the overlap with
the tooling library kept shrinking. I don't think the library provides
any value anymore.

I decided to remove the dependency and only reimplement the small bits
required by the scanner.

This allowed for a nice simplification, where we no longer need to
create temporary dummy `FileManager` instances (mis-named as
`DriverFileMgr` in some parts) and `SourceManager` instances to attach
to the `DiagnosticsEngine`. That code was copied from the tooling
library to support `DiagnosticConsumers` that expect these to exist. The
scanner uses a closed set of consumers and none need these objects to
exist.

The motivation for this (hopefully NFC) patch are some new restrictions
to how VFS's can be propagated in Clang that I'm working on.
qiongsiwu pushed a commit to qiongsiwu/llvm-project that referenced this pull request Oct 14, 2025
The dependency scanner was initially using a fair amount of
infrastructure provided by the `clangTooling` library. Over time, the
needs for bespoke handling of command lines grew and the overlap with
the tooling library kept shrinking. I don't think the library provides
any value anymore.

I decided to remove the dependency and only reimplement the small bits
required by the scanner.

This allowed for a nice simplification, where we no longer need to
create temporary dummy `FileManager` instances (mis-named as
`DriverFileMgr` in some parts) and `SourceManager` instances to attach
to the `DiagnosticsEngine`. That code was copied from the tooling
library to support `DiagnosticConsumers` that expect these to exist. The
scanner uses a closed set of consumers and none need these objects to
exist.

The motivation for this (hopefully NFC) patch are some new restrictions
to how VFS's can be propagated in Clang that I'm working on.

(cherry picked from commit aa1b416)
jansvoboda11 added a commit to swiftlang/llvm-project that referenced this pull request Oct 18, 2025
The dependency scanner was initially using a fair amount of
infrastructure provided by the `clangTooling` library. Over time, the
needs for bespoke handling of command lines grew and the overlap with
the tooling library kept shrinking. I don't think the library provides
any value anymore.

I decided to remove the dependency and only reimplement the small bits
required by the scanner.

This allowed for a nice simplification, where we no longer need to
create temporary dummy `FileManager` instances (mis-named as
`DriverFileMgr` in some parts) and `SourceManager` instances to attach
to the `DiagnosticsEngine`. That code was copied from the tooling
library to support `DiagnosticConsumers` that expect these to exist. The
scanner uses a closed set of consumers and none need these objects to
exist.

The motivation for this (hopefully NFC) patch are some new restrictions
to how VFS's can be propagated in Clang that I'm working on.
jansvoboda11 added a commit to swiftlang/llvm-project that referenced this pull request Oct 23, 2025
The dependency scanner was initially using a fair amount of
infrastructure provided by the `clangTooling` library. Over time, the
needs for bespoke handling of command lines grew and the overlap with
the tooling library kept shrinking. I don't think the library provides
any value anymore.

I decided to remove the dependency and only reimplement the small bits
required by the scanner.

This allowed for a nice simplification, where we no longer need to
create temporary dummy `FileManager` instances (mis-named as
`DriverFileMgr` in some parts) and `SourceManager` instances to attach
to the `DiagnosticsEngine`. That code was copied from the tooling
library to support `DiagnosticConsumers` that expect these to exist. The
scanner uses a closed set of consumers and none need these objects to
exist.

The motivation for this (hopefully NFC) patch are some new restrictions
to how VFS's can be propagated in Clang that I'm working on.
qiongsiwu added a commit to swiftlang/llvm-project that referenced this pull request Oct 27, 2025
…Instance Sharing (#11631)

* [clang][deps] Remove dependency on `tooling::ToolAction` (llvm#149904)

The dependency scanner was initially using a fair amount of
infrastructure provided by the `clangTooling` library. Over time, the
needs for bespoke handling of command lines grew and the overlap with
the tooling library kept shrinking. I don't think the library provides
any value anymore.

I decided to remove the dependency and only reimplement the small bits
required by the scanner.

This allowed for a nice simplification, where we no longer need to
create temporary dummy `FileManager` instances (mis-named as
`DriverFileMgr` in some parts) and `SourceManager` instances to attach
to the `DiagnosticsEngine`. That code was copied from the tooling
library to support `DiagnosticConsumers` that expect these to exist. The
scanner uses a closed set of consumers and none need these objects to
exist.

The motivation for this (hopefully NFC) patch are some new restrictions
to how VFS's can be propagated in Clang that I'm working on.

(cherry picked from commit aa1b416)

* Reland "[clang] Delay normalization of `-fmodules-cache-path` (llvm#150123)"

This reverts commit 613caa9, essentially
reapplying 4a4bdde after moving
`normalizeModuleCachePath` from clangFrontend to clangLex.

This PR is part of an effort to remove file system usage from the
command line parsing code. The reason for that is that it's impossible
to do file system access correctly without a configured VFS, and the VFS
can only be configured after the command line is parsed. I don't want to
intertwine command line parsing and VFS configuration, so I decided to
perform the file system access after the command line is parsed and the
VFS is configured - ideally right before the file system entity is used
for the first time.

This patch delays normalization of the module cache path until
`CompilerInstance` is asked for the cache path in the current
compilation context.

(cherry picked from commit 55bef46)

* NFC: Clean up of IntrusiveRefCntPtr construction from raw pointers. (llvm#151545)

Handles clang::DiagnosticsEngine and clang::DiagnosticIDs.

For DiagnosticIDs, this mostly migrates from `new DiagnosticIDs` to
convenience method `DiagnosticIDs::create()`.

Part of cleanup llvm#151026

(cherry picked from commit c7f3437)

 Conflicts:
	clang/tools/driver/cc1_main.cpp
	clang/unittests/Driver/DXCModeTest.cpp
	clang/unittests/Driver/SimpleDiagnosticConsumer.h
	clang/unittests/Frontend/SearchPathTest.cpp
	clang/unittests/Lex/HeaderSearchTest.cpp
	clang/unittests/Tooling/RewriterTestContext.h

* NFC: Clean up of IntrusiveRefCntPtr construction from raw pointers. (llvm#151782)

This commit handles the following types:
- clang::ExternalASTSource
- clang::TargetInfo
- clang::ASTContext
- clang::SourceManager
- clang::FileManager

Part of cleanup llvm#151026

(cherry picked from commit 4205da0)

 Conflicts:
	clang/lib/Frontend/ASTUnit.cpp
	clang/lib/Frontend/ChainedIncludesSource.cpp
	clang/lib/Frontend/CompilerInstance.cpp

* Merge commit '30633f308941' from llvm.org/main into next

(cherry picked from commit 95ea104)

 Conflicts:
	clang/include/clang/Frontend/CompilerInstance.h
	clang/lib/Frontend/CompilerInstance.cpp

* Merge pull request #11450 from swiftlang/jan_svoboda/cas-fix-early-vfs

[clang] Fix CAS initialization after upstream llvm#158381

(cherry picked from commit 6d73002)

* [clang] Avoid reparsing VFS overlay files for module dep collector (llvm#158372)

This PR uses the new-ish `llvm::vfs::FileSystem::visit()` interface to
collect VFS overlay entries from an existing `FileSystem` instance
rather than parsing the VFS YAML file anew. This prevents duplicate
diagnostics as observed by `clang/test/VFS/broken-vfs-module-dep.c`.

(cherry picked from commit 4957c47)

* [clang] Don't fail `ExecuteCompilerInvocation()` due to caller errors (llvm#158695)

This PR changes the behavior of `clang::ExecuteCompilerInvocation()` so
that it only returns early when the `DiagnosticsEngine` emitted errors
**within** the function. Handling errors emitted before the function got
called is a responsibility of the caller. Necessary for llvm#158381.

(cherry picked from commit f33fb0d)

* [clang] Only set non-empty bypass to scan VFS (llvm#159605)

Normalizing an empty modules cache path results in an incorrect
non-empty path (the working directory). This PR conditionalizes more
code to avoid this. Tested downstream by swift/llvm-project and the
`DependencyScanningCAPITests.DependencyScanningFSCacheOutOfDate` unit
test.

(cherry picked from commit 5a339b0)

* Merge commit '0e35f56d40d3' from llvm.org/main into next

(cherry picked from commit 3efcc0f)

 Conflicts:
	clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp

* [clang] NFCI: Clean up `CompilerInstance::create{File,Source}Manager()` (llvm#160748)

The `CompilerInstance::createSourceManager()` function currently accepts
the `FileManager` to be used. However, all clients call
`CompilerInstance::createFileManager()` prior to creating the
`SourceManager`, and it never makes sense to use a `FileManager` in the
`SourceManager` that's different from the rest of the compiler. Passing
the `FileManager` explicitly is redundant, error-prone, and deviates
from the style of other `CompilerInstance` initialization APIs.

This PR therefore removes the `FileManager` parameter from
`createSourceManager()` and also stops returning the `FileManager`
pointer from `createFileManager()`, since that was its primary use. Now,
`createSourceManager()` internally calls `getFileManager()` instead.

(cherry picked from commit b86ddae)

 Conflicts:
	clang/lib/Tooling/DependencyScanning/DependencyScannerImpl.cpp

* Merge commit '436861645247' from llvm.org/main into next

(cherry picked from commit 286ea7d)

 Conflicts:
	clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp

* [clang] Pass VFS into `ASTUnit::LoadFromASTFile()` (llvm#159166)

This PR makes the `VFS` parameter to `ASTUnit::LoadFromASTFile()`
required and explicit, rather than silently defaulting to the real file
system. This makes it easy to correctly propagate the fully-configured
VFS and load any input files like the rest of the compiler does.

(cherry picked from commit cda542d)

* Fix a line missing when merging 30633f3

* [clang][deps] Fix a use-after-free from expanding response files (llvm#164676)

In 4368616 we accidentally moved uses of command-line args saved
into a bump pointer allocator during response file expansion out of
scope of the allocator. Also, the test that should have caught this (at
least with asan) was not working correctly because clang-scan-deps was
expanding response files itself during argument adjustment rather than
the underlying scanner library.

rdar://162720059
(cherry picked from commit 3e6f696)

---------

Co-authored-by: Jan Svoboda <jan_svoboda@apple.com>
Co-authored-by: James Y Knight <jyknight@google.com>
Co-authored-by: git apple-llvm automerger <am@git-apple-llvm>
Co-authored-by: Ben Langmuir <blangmuir@apple.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants