-
Couldn't load subscription status.
- Fork 15k
[clang] Delay normalization of -fmodules-cache-path
#150123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang] Delay normalization of -fmodules-cache-path
#150123
Conversation
|
@llvm/pr-subscribers-clang-modules @llvm/pr-subscribers-clang Author: Jan Svoboda (jansvoboda11) ChangesThis PR is part of an effort to remove file system usage from the command line parsing code. The reason for that is that it's impossible to do file system access correctly without a configured VFS, and the VFS can only be configured after the command line is parsed. I don't want to intertwine command line parsing and VFS configuration, so I decided to perform the file system access after the command line is parsed and the VFS is configured - ideally right before the file system entity is used for the first time. This patch delays normalization of the module cache path until Full diff: https://github.com/llvm/llvm-project/pull/150123.diff 5 Files Affected:
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 97b2eb9c44a88..5f2f69195a7bd 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3226,7 +3226,8 @@ defm declspec : BoolOption<"f", "declspec",
def fmodules_cache_path : Joined<["-"], "fmodules-cache-path=">, Group<i_Group>,
Flags<[]>, Visibility<[ClangOption, CC1Option]>,
MetaVarName<"<directory>">,
- HelpText<"Specify the module cache path">;
+ HelpText<"Specify the module cache path">,
+ MarshallingInfoString<HeaderSearchOpts<"ModuleCachePath">>;
def fmodules_user_build_path : Separate<["-"], "fmodules-user-build-path">, Group<i_Group>,
Flags<[]>, Visibility<[ClangOption, CC1Option]>,
MetaVarName<"<directory>">,
diff --git a/clang/lib/Frontend/CompilerInstance.cpp b/clang/lib/Frontend/CompilerInstance.cpp
index c7b82db292a14..82c80832bfeef 100644
--- a/clang/lib/Frontend/CompilerInstance.cpp
+++ b/clang/lib/Frontend/CompilerInstance.cpp
@@ -538,9 +538,14 @@ void CompilerInstance::createPreprocessor(TranslationUnitKind TUKind) {
}
std::string CompilerInstance::getSpecificModuleCachePath(StringRef ModuleHash) {
+ if (getHeaderSearchOpts().ModuleCachePath.empty())
+ return "";
+
// Set up the module path, including the hash for the module-creation options.
SmallString<256> SpecificModuleCache(getHeaderSearchOpts().ModuleCachePath);
- if (!SpecificModuleCache.empty() && !getHeaderSearchOpts().DisableModuleHash)
+ FileMgr->makeAbsolutePath(SpecificModuleCache);
+ llvm::sys::path::remove_dots(SpecificModuleCache);
+ if (!getHeaderSearchOpts().DisableModuleHash)
llvm::sys::path::append(SpecificModuleCache, ModuleHash);
return std::string(SpecificModuleCache);
}
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index 3a36250da57a3..a172602a8c73b 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -3306,9 +3306,6 @@ static void GenerateHeaderSearchArgs(const HeaderSearchOptions &Opts,
if (Opts.UseLibcxx)
GenerateArg(Consumer, OPT_stdlib_EQ, "libc++");
- if (!Opts.ModuleCachePath.empty())
- GenerateArg(Consumer, OPT_fmodules_cache_path, Opts.ModuleCachePath);
-
for (const auto &File : Opts.PrebuiltModuleFiles)
GenerateArg(Consumer, OPT_fmodule_file, File.first + "=" + File.second);
@@ -3411,8 +3408,7 @@ static void GenerateHeaderSearchArgs(const HeaderSearchOptions &Opts,
}
static bool ParseHeaderSearchArgs(HeaderSearchOptions &Opts, ArgList &Args,
- DiagnosticsEngine &Diags,
- const std::string &WorkingDir) {
+ DiagnosticsEngine &Diags) {
unsigned NumErrorsBefore = Diags.getNumErrors();
HeaderSearchOptions *HeaderSearchOpts = &Opts;
@@ -3425,17 +3421,6 @@ static bool ParseHeaderSearchArgs(HeaderSearchOptions &Opts, ArgList &Args,
if (const Arg *A = Args.getLastArg(OPT_stdlib_EQ))
Opts.UseLibcxx = (strcmp(A->getValue(), "libc++") == 0);
- // Canonicalize -fmodules-cache-path before storing it.
- SmallString<128> P(Args.getLastArgValue(OPT_fmodules_cache_path));
- if (!(P.empty() || llvm::sys::path::is_absolute(P))) {
- if (WorkingDir.empty())
- llvm::sys::fs::make_absolute(P);
- else
- llvm::sys::fs::make_absolute(WorkingDir, P);
- }
- llvm::sys::path::remove_dots(P);
- Opts.ModuleCachePath = std::string(P);
-
// Only the -fmodule-file=<name>=<file> form.
for (const auto *A : Args.filtered(OPT_fmodule_file)) {
StringRef Val = A->getValue();
@@ -5064,8 +5049,7 @@ bool CompilerInvocation::CreateFromArgsImpl(
InputKind DashX = Res.getFrontendOpts().DashX;
ParseTargetArgs(Res.getTargetOpts(), Args, Diags);
llvm::Triple T(Res.getTargetOpts().Triple);
- ParseHeaderSearchArgs(Res.getHeaderSearchOpts(), Args, Diags,
- Res.getFileSystemOpts().WorkingDir);
+ ParseHeaderSearchArgs(Res.getHeaderSearchOpts(), Args, Diags);
if (Res.getFrontendOpts().GenReducedBMI ||
Res.getFrontendOpts().ProgramAction ==
frontend::GenerateReducedModuleInterface ||
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index a6957e54b66f1..4148caf93fd34 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -1172,51 +1172,36 @@ static bool cleanPathForOutput(FileManager &FileMgr,
return Changed | llvm::sys::path::remove_dots(Path);
}
-/// Adjusts the given filename to only write out the portion of the
-/// filename that is not part of the system root directory.
+/// Adjusts the given filename to only write out the portion of the filename
+/// that is not part of the base directory.
///
/// \param Filename the file name to adjust.
///
-/// \param BaseDir When non-NULL, the PCH file is a relocatable AST file and
-/// the returned filename will be adjusted by this root directory.
+/// \param BasePath When not empty, the AST file is relocatable and the returned
+/// filename will be adjusted to be relative to this path.
///
-/// \returns either the original filename (if it needs no adjustment) or the
-/// adjusted filename (which points into the @p Filename parameter).
-static const char *
-adjustFilenameForRelocatableAST(const char *Filename, StringRef BaseDir) {
- assert(Filename && "No file name to adjust?");
-
- if (BaseDir.empty())
- return Filename;
-
- // Verify that the filename and the system root have the same prefix.
- unsigned Pos = 0;
- for (; Filename[Pos] && Pos < BaseDir.size(); ++Pos)
- if (Filename[Pos] != BaseDir[Pos])
- return Filename; // Prefixes don't match.
-
- // We hit the end of the filename before we hit the end of the system root.
- if (!Filename[Pos])
- return Filename;
-
- // If there's not a path separator at the end of the base directory nor
- // immediately after it, then this isn't within the base directory.
- if (!llvm::sys::path::is_separator(Filename[Pos])) {
- if (!llvm::sys::path::is_separator(BaseDir.back()))
- return Filename;
- } else {
- // If the file name has a '/' at the current position, skip over the '/'.
- // We distinguish relative paths from absolute paths by the
- // absence of '/' at the beginning of relative paths.
- //
- // FIXME: This is wrong. We distinguish them by asking if the path is
- // absolute, which isn't the same thing. And there might be multiple '/'s
- // in a row. Use a better mechanism to indicate whether we have emitted an
- // absolute or relative path.
- ++Pos;
- }
+/// \returns true when \c Filename was adjusted, false otherwise.
+static bool adjustFilenameForRelocatableAST(SmallVectorImpl<char> &Filename,
+ StringRef BasePath) {
+ auto FileIt = llvm::sys::path::begin({Filename.begin(), Filename.size()});
+ auto FileEnd = llvm::sys::path::end({Filename.begin(), Filename.size()});
+
+ auto BaseIt = llvm::sys::path::begin(BasePath);
+ auto BaseEnd = llvm::sys::path::end(BasePath);
+
+ for (; FileIt != FileEnd && BaseIt != BaseEnd; ++FileIt, ++BaseIt)
+ if (*FileIt != *BaseIt)
+ return false;
+
+ if (FileIt == FileEnd)
+ return false;
+
+ SmallString<128> Clean;
+ for (; FileIt != FileEnd; ++FileIt)
+ llvm::sys::path::append(Clean, *FileIt);
- return Filename + Pos;
+ Filename.assign(Clean.begin(), Clean.end());
+ return true;
}
std::pair<ASTFileSignature, ASTFileSignature>
@@ -1699,7 +1684,7 @@ void ASTWriter::WriteControlBlock(Preprocessor &PP, StringRef isysroot) {
AddString(HSOpts.Sysroot, Record);
AddString(HSOpts.ResourceDir, Record);
- AddString(HSOpts.ModuleCachePath, Record);
+ AddPath(HSOpts.ModuleCachePath, Record);
AddString(HSOpts.ModuleUserBuildPath, Record);
Record.push_back(HSOpts.DisableModuleHash);
Record.push_back(HSOpts.ImplicitModuleMaps);
@@ -5329,16 +5314,8 @@ bool ASTWriter::PreparePathForOutput(SmallVectorImpl<char> &Path) {
return false;
bool Changed = cleanPathForOutput(PP->getFileManager(), Path);
-
- // Remove a prefix to make the path relative, if relevant.
- const char *PathBegin = Path.data();
- const char *PathPtr =
- adjustFilenameForRelocatableAST(PathBegin, BaseDirectory);
- if (PathPtr != PathBegin) {
- Path.erase(Path.begin(), Path.begin() + (PathPtr - PathBegin));
+ if (adjustFilenameForRelocatableAST(Path, BaseDirectory))
Changed = true;
- }
-
return Changed;
}
diff --git a/clang/test/Modules/modules-cache-path-canonicalization-output.c b/clang/test/Modules/modules-cache-path-canonicalization-output.c
new file mode 100644
index 0000000000000..ad71b69e5299e
--- /dev/null
+++ b/clang/test/Modules/modules-cache-path-canonicalization-output.c
@@ -0,0 +1,18 @@
+// This checks that implicitly-built modules produce identical PCM
+// files regardless of the spelling of the same module cache path.
+
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// RUN: %clang_cc1 -fmodules -fimplicit-module-maps -fsyntax-only %t/tu.c \
+// RUN: -fmodules-cache-path=%t/cache -fdisable-module-hash
+// RUN: mv %t/cache/M.pcm %t/M.pcm
+// RUN: %clang_cc1 -fmodules -fimplicit-module-maps -fsyntax-only %t/tu.c \
+// RUN: -fmodules-cache-path=%t/./cache -fdisable-module-hash
+// RUN: diff %t/./cache/M.pcm %t/M.pcm
+
+//--- tu.c
+#include "M.h"
+//--- M.h
+//--- module.modulemap
+module M { header "M.h" }
|
ba931e2 to
a3221d4
Compare
|
Windows CI is hitting issues that look legit, I'll have to look into those. The Linux issue in lldb-api :: commands/statistics/basic/TestStats.py might be a fallout from #155023. |
|
The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One suggestion, but basically LGTM
This reverts commit 4a4bdde. The Serialization library doesn't link Frontend, where CompilerInstance lives, causing link failures on some build bots.
This reverts commit 613caa9, essentially reapplying 4a4bdde after moving `normalizeModuleCachePath` from clangFrontend to clangLex. This PR is part of an effort to remove file system usage from the command line parsing code. The reason for that is that it's impossible to do file system access correctly without a configured VFS, and the VFS can only be configured after the command line is parsed. I don't want to intertwine command line parsing and VFS configuration, so I decided to perform the file system access after the command line is parsed and the VFS is configured - ideally right before the file system entity is used for the first time. This patch delays normalization of the module cache path until `CompilerInstance` is asked for the cache path in the current compilation context.
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/204/builds/21662 Here is the relevant piece of the build log for the reference |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/205/builds/21639 Here is the relevant piece of the build log for the reference |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/203/builds/22850 Here is the relevant piece of the build log for the reference |
…50123)" This reverts commit 613caa9, essentially reapplying 4a4bdde after moving `normalizeModuleCachePath` from clangFrontend to clangLex. This PR is part of an effort to remove file system usage from the command line parsing code. The reason for that is that it's impossible to do file system access correctly without a configured VFS, and the VFS can only be configured after the command line is parsed. I don't want to intertwine command line parsing and VFS configuration, so I decided to perform the file system access after the command line is parsed and the VFS is configured - ideally right before the file system entity is used for the first time. This patch delays normalization of the module cache path until `CompilerInstance` is asked for the cache path in the current compilation context. (cherry picked from commit 55bef46)
…50123)" This reverts commit 613caa9, essentially reapplying 4a4bdde after moving `normalizeModuleCachePath` from clangFrontend to clangLex. This PR is part of an effort to remove file system usage from the command line parsing code. The reason for that is that it's impossible to do file system access correctly without a configured VFS, and the VFS can only be configured after the command line is parsed. I don't want to intertwine command line parsing and VFS configuration, so I decided to perform the file system access after the command line is parsed and the VFS is configured - ideally right before the file system entity is used for the first time. This patch delays normalization of the module cache path until `CompilerInstance` is asked for the cache path in the current compilation context. (cherry picked from commit 55bef46)
…50123)" This reverts commit 613caa9, essentially reapplying 4a4bdde after moving `normalizeModuleCachePath` from clangFrontend to clangLex. This PR is part of an effort to remove file system usage from the command line parsing code. The reason for that is that it's impossible to do file system access correctly without a configured VFS, and the VFS can only be configured after the command line is parsed. I don't want to intertwine command line parsing and VFS configuration, so I decided to perform the file system access after the command line is parsed and the VFS is configured - ideally right before the file system entity is used for the first time. This patch delays normalization of the module cache path until `CompilerInstance` is asked for the cache path in the current compilation context. (cherry picked from commit 55bef46)
PR #150123 changed how we normalize the modules cache path. Unfortunately, empty path would get normalized to the current working directory. This means that even explicitly-built PCMs that don't rely on the CWD now embed it, leading to surprising behavior. This PR fixes that by normalizing an empty modules cache path to an empty string.
…WD (#164840) PR llvm/llvm-project#150123 changed how we normalize the modules cache path. Unfortunately, empty path would get normalized to the current working directory. This means that even explicitly-built PCMs that don't rely on the CWD now embed it, leading to surprising behavior. This PR fixes that by normalizing an empty modules cache path to an empty string.
…50123)" This reverts commit 613caa9, essentially reapplying 4a4bdde after moving `normalizeModuleCachePath` from clangFrontend to clangLex. This PR is part of an effort to remove file system usage from the command line parsing code. The reason for that is that it's impossible to do file system access correctly without a configured VFS, and the VFS can only be configured after the command line is parsed. I don't want to intertwine command line parsing and VFS configuration, so I decided to perform the file system access after the command line is parsed and the VFS is configured - ideally right before the file system entity is used for the first time. This patch delays normalization of the module cache path until `CompilerInstance` is asked for the cache path in the current compilation context. (cherry picked from commit 55bef46)
…4840) PR llvm#150123 changed how we normalize the modules cache path. Unfortunately, empty path would get normalized to the current working directory. This means that even explicitly-built PCMs that don't rely on the CWD now embed it, leading to surprising behavior. This PR fixes that by normalizing an empty modules cache path to an empty string.
…4840) PR llvm#150123 changed how we normalize the modules cache path. Unfortunately, empty path would get normalized to the current working directory. This means that even explicitly-built PCMs that don't rely on the CWD now embed it, leading to surprising behavior. This PR fixes that by normalizing an empty modules cache path to an empty string.
…Instance Sharing (#11631) * [clang][deps] Remove dependency on `tooling::ToolAction` (llvm#149904) The dependency scanner was initially using a fair amount of infrastructure provided by the `clangTooling` library. Over time, the needs for bespoke handling of command lines grew and the overlap with the tooling library kept shrinking. I don't think the library provides any value anymore. I decided to remove the dependency and only reimplement the small bits required by the scanner. This allowed for a nice simplification, where we no longer need to create temporary dummy `FileManager` instances (mis-named as `DriverFileMgr` in some parts) and `SourceManager` instances to attach to the `DiagnosticsEngine`. That code was copied from the tooling library to support `DiagnosticConsumers` that expect these to exist. The scanner uses a closed set of consumers and none need these objects to exist. The motivation for this (hopefully NFC) patch are some new restrictions to how VFS's can be propagated in Clang that I'm working on. (cherry picked from commit aa1b416) * Reland "[clang] Delay normalization of `-fmodules-cache-path` (llvm#150123)" This reverts commit 613caa9, essentially reapplying 4a4bdde after moving `normalizeModuleCachePath` from clangFrontend to clangLex. This PR is part of an effort to remove file system usage from the command line parsing code. The reason for that is that it's impossible to do file system access correctly without a configured VFS, and the VFS can only be configured after the command line is parsed. I don't want to intertwine command line parsing and VFS configuration, so I decided to perform the file system access after the command line is parsed and the VFS is configured - ideally right before the file system entity is used for the first time. This patch delays normalization of the module cache path until `CompilerInstance` is asked for the cache path in the current compilation context. (cherry picked from commit 55bef46) * NFC: Clean up of IntrusiveRefCntPtr construction from raw pointers. (llvm#151545) Handles clang::DiagnosticsEngine and clang::DiagnosticIDs. For DiagnosticIDs, this mostly migrates from `new DiagnosticIDs` to convenience method `DiagnosticIDs::create()`. Part of cleanup llvm#151026 (cherry picked from commit c7f3437) Conflicts: clang/tools/driver/cc1_main.cpp clang/unittests/Driver/DXCModeTest.cpp clang/unittests/Driver/SimpleDiagnosticConsumer.h clang/unittests/Frontend/SearchPathTest.cpp clang/unittests/Lex/HeaderSearchTest.cpp clang/unittests/Tooling/RewriterTestContext.h * NFC: Clean up of IntrusiveRefCntPtr construction from raw pointers. (llvm#151782) This commit handles the following types: - clang::ExternalASTSource - clang::TargetInfo - clang::ASTContext - clang::SourceManager - clang::FileManager Part of cleanup llvm#151026 (cherry picked from commit 4205da0) Conflicts: clang/lib/Frontend/ASTUnit.cpp clang/lib/Frontend/ChainedIncludesSource.cpp clang/lib/Frontend/CompilerInstance.cpp * Merge commit '30633f308941' from llvm.org/main into next (cherry picked from commit 95ea104) Conflicts: clang/include/clang/Frontend/CompilerInstance.h clang/lib/Frontend/CompilerInstance.cpp * Merge pull request #11450 from swiftlang/jan_svoboda/cas-fix-early-vfs [clang] Fix CAS initialization after upstream llvm#158381 (cherry picked from commit 6d73002) * [clang] Avoid reparsing VFS overlay files for module dep collector (llvm#158372) This PR uses the new-ish `llvm::vfs::FileSystem::visit()` interface to collect VFS overlay entries from an existing `FileSystem` instance rather than parsing the VFS YAML file anew. This prevents duplicate diagnostics as observed by `clang/test/VFS/broken-vfs-module-dep.c`. (cherry picked from commit 4957c47) * [clang] Don't fail `ExecuteCompilerInvocation()` due to caller errors (llvm#158695) This PR changes the behavior of `clang::ExecuteCompilerInvocation()` so that it only returns early when the `DiagnosticsEngine` emitted errors **within** the function. Handling errors emitted before the function got called is a responsibility of the caller. Necessary for llvm#158381. (cherry picked from commit f33fb0d) * [clang] Only set non-empty bypass to scan VFS (llvm#159605) Normalizing an empty modules cache path results in an incorrect non-empty path (the working directory). This PR conditionalizes more code to avoid this. Tested downstream by swift/llvm-project and the `DependencyScanningCAPITests.DependencyScanningFSCacheOutOfDate` unit test. (cherry picked from commit 5a339b0) * Merge commit '0e35f56d40d3' from llvm.org/main into next (cherry picked from commit 3efcc0f) Conflicts: clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp * [clang] NFCI: Clean up `CompilerInstance::create{File,Source}Manager()` (llvm#160748) The `CompilerInstance::createSourceManager()` function currently accepts the `FileManager` to be used. However, all clients call `CompilerInstance::createFileManager()` prior to creating the `SourceManager`, and it never makes sense to use a `FileManager` in the `SourceManager` that's different from the rest of the compiler. Passing the `FileManager` explicitly is redundant, error-prone, and deviates from the style of other `CompilerInstance` initialization APIs. This PR therefore removes the `FileManager` parameter from `createSourceManager()` and also stops returning the `FileManager` pointer from `createFileManager()`, since that was its primary use. Now, `createSourceManager()` internally calls `getFileManager()` instead. (cherry picked from commit b86ddae) Conflicts: clang/lib/Tooling/DependencyScanning/DependencyScannerImpl.cpp * Merge commit '436861645247' from llvm.org/main into next (cherry picked from commit 286ea7d) Conflicts: clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp * [clang] Pass VFS into `ASTUnit::LoadFromASTFile()` (llvm#159166) This PR makes the `VFS` parameter to `ASTUnit::LoadFromASTFile()` required and explicit, rather than silently defaulting to the real file system. This makes it easy to correctly propagate the fully-configured VFS and load any input files like the rest of the compiler does. (cherry picked from commit cda542d) * Fix a line missing when merging 30633f3 * [clang][deps] Fix a use-after-free from expanding response files (llvm#164676) In 4368616 we accidentally moved uses of command-line args saved into a bump pointer allocator during response file expansion out of scope of the allocator. Also, the test that should have caught this (at least with asan) was not working correctly because clang-scan-deps was expanding response files itself during argument adjustment rather than the underlying scanner library. rdar://162720059 (cherry picked from commit 3e6f696) --------- Co-authored-by: Jan Svoboda <jan_svoboda@apple.com> Co-authored-by: James Y Knight <jyknight@google.com> Co-authored-by: git apple-llvm automerger <am@git-apple-llvm> Co-authored-by: Ben Langmuir <blangmuir@apple.com>
…4840) PR llvm#150123 changed how we normalize the modules cache path. Unfortunately, empty path would get normalized to the current working directory. This means that even explicitly-built PCMs that don't rely on the CWD now embed it, leading to surprising behavior. This PR fixes that by normalizing an empty modules cache path to an empty string.
This PR is part of an effort to remove file system usage from the command line parsing code. The reason for that is that it's impossible to do file system access correctly without a configured VFS, and the VFS can only be configured after the command line is parsed. I don't want to intertwine command line parsing and VFS configuration, so I decided to perform the file system access after the command line is parsed and the VFS is configured - ideally right before the file system entity is used for the first time.
This patch delays normalization of the module cache path until
CompilerInstanceis asked for the cache path in the current compilation context. This needed some rewriting of the serialization function that makes a path relative, which was triggering an OOB access for paths in the form of"<BasePath>/.".