-
Notifications
You must be signed in to change notification settings - Fork 340
[libclang] Add API to generate a reproducer for the explicitly-built modules. #10577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: next
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
// Test generating a reproducer for a modular build where required modules are | ||
// built explicitly as separate steps. | ||
|
||
// RUN: rm -rf %t | ||
// RUN: split-file %s %t | ||
// | ||
// RUN: c-index-test core -gen-deps-reproducer -working-dir %t \ | ||
// RUN: -- clang-executable -c %t/reproducer.c -o %t/reproducer.o \ | ||
// RUN: -fmodules -fmodules-cache-path=%t | FileCheck %t/reproducer.c | ||
|
||
// Test a failed attempt at generating a reproducer. | ||
// RUN: not c-index-test core -gen-deps-reproducer -working-dir %t \ | ||
// RUN: -- clang-executable -c %t/failed-reproducer.c -o %t/reproducer.o \ | ||
// RUN: -fmodules -fmodules-cache-path=%t 2>&1 | FileCheck %t/failed-reproducer.c | ||
|
||
// Test the content of a reproducer script. | ||
// RUN: c-index-test core -gen-deps-reproducer -working-dir %t -o %t/repro-content \ | ||
// RUN: -- clang-executable -c %t/reproducer.c -o %t/reproducer.o \ | ||
// RUN: -fmodules -fmodules-cache-path=%t | ||
// RUN: FileCheck %t/script-expectations.txt --input-file %t/repro-content/reproducer.sh | ||
|
||
//--- modular-header.h | ||
void fn_in_modular_header(void); | ||
|
||
//--- module.modulemap | ||
module Test { header "modular-header.h" export * } | ||
|
||
//--- reproducer.c | ||
// CHECK: Sources and associated run script(s) are located at: | ||
#include "modular-header.h" | ||
|
||
void test(void) { | ||
fn_in_modular_header(); | ||
} | ||
|
||
//--- failed-reproducer.c | ||
// CHECK: fatal error: 'non-existing-header.h' file not found | ||
#include "non-existing-header.h" | ||
|
||
//--- script-expectations.txt | ||
CHECK: clang-executable | ||
CHECK: -fmodule-file=Test=reproducer.cache/explicitly-built-modules/Test-{{.*}}.pcm |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -317,6 +317,150 @@ void clang_experimental_DependencyScannerWorkerScanSettings_dispose( | |
delete unwrap(Settings); | ||
} | ||
|
||
namespace { | ||
// Helper class to capture a returnable error code and to return a formatted | ||
// message in a provided CXString pointer. | ||
class MessageEmitter { | ||
const CXErrorCode ErrorCode; | ||
CXString *OutputString; | ||
std::string Buffer; | ||
llvm::raw_string_ostream Stream; | ||
|
||
public: | ||
MessageEmitter(CXErrorCode Code, CXString *Output) | ||
: ErrorCode(Code), OutputString(Output), Stream(Buffer) {} | ||
~MessageEmitter() { | ||
if (OutputString) | ||
*OutputString = clang::cxstring::createDup(Buffer.c_str()); | ||
} | ||
|
||
operator CXErrorCode() const { return ErrorCode; } | ||
|
||
template <typename T> MessageEmitter &operator<<(const T &t) { | ||
Stream << t; | ||
return *this; | ||
} | ||
}; | ||
} // end anonymous namespace | ||
|
||
enum CXErrorCode clang_experimental_DependencyScanner_generateReproducer( | ||
int argc, const char *const *argv, const char *ModuleName, | ||
const char *WorkingDirectory, const char *ReproducerLocation, | ||
bool UseUniqueReproducerName, CXString *MessageOut) { | ||
auto Report = [MessageOut](CXErrorCode ErrorCode) -> MessageEmitter { | ||
return MessageEmitter(ErrorCode, MessageOut); | ||
}; | ||
auto ReportFailure = [&Report]() -> MessageEmitter { | ||
return Report(CXError_Failure); | ||
}; | ||
|
||
if (argc < 2 || !argv) | ||
return Report(CXError_InvalidArguments) << "missing compilation command"; | ||
if (!WorkingDirectory) | ||
return Report(CXError_InvalidArguments) << "missing working directory"; | ||
if (!UseUniqueReproducerName && !ReproducerLocation) | ||
return Report(CXError_InvalidArguments) | ||
<< "non-unique reproducer is allowed only in a custom location"; | ||
|
||
CASOptions CASOpts; | ||
IntrusiveRefCntPtr<llvm::cas::CachingOnDiskFileSystem> FS; | ||
DependencyScanningService DepsService( | ||
ScanningMode::DependencyDirectivesScan, ScanningOutputFormat::Full, | ||
CASOpts, /*CAS=*/nullptr, /*ActionCache=*/nullptr, FS); | ||
DependencyScanningTool DepsTool(DepsService); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why does this API create its own service/worker/tool instead of taking a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I wanted the reproducer scan to be independent from previous scans, that's why don't use an existing worker. It is a trade-off, not a uniquely correct approach. Reproducers should be useful in general, not only for debugging issues with EBM, that's why I'm leaning towards from-scratch scanning. And I hope it is easier to reason about than to take into account an invisible worker's state. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Can you clarify what specifically you want to be independent from? Is it the file system cache? |
||
|
||
llvm::SmallString<128> ReproScriptPath; | ||
int ScriptFD; | ||
if (ReproducerLocation) { | ||
if (auto EC = llvm::sys::fs::create_directories(ReproducerLocation)) | ||
return ReportFailure() << "failed to create a reproducer location '" | ||
<< ReproducerLocation << "'\n" | ||
<< EC.message(); | ||
SmallString<128> Path(ReproducerLocation); | ||
llvm::sys::path::append(Path, "reproducer"); | ||
const char *UniqueSuffix = UseUniqueReproducerName ? "-%%%%%%" : ""; | ||
if (auto EC = llvm::sys::fs::createUniqueFile(Path + UniqueSuffix + ".sh", | ||
ScriptFD, ReproScriptPath)) | ||
return ReportFailure() << "failed to create a reproducer script file\n" | ||
<< EC.message(); | ||
} else { | ||
if (auto EC = llvm::sys::fs::createTemporaryFile( | ||
"reproducer", "sh", ScriptFD, ReproScriptPath)) { | ||
return ReportFailure() << "failed to create a reproducer script file\n" | ||
<< EC.message(); | ||
} | ||
} | ||
SmallString<128> FileCachePath = ReproScriptPath; | ||
llvm::sys::path::replace_extension(FileCachePath, ".cache"); | ||
|
||
std::string FileCacheName = llvm::sys::path::filename(FileCachePath).str(); | ||
auto LookupOutput = [&FileCacheName](const ModuleDeps &MD, | ||
ModuleOutputKind MOK) -> std::string { | ||
if (MOK != ModuleOutputKind::ModuleFile) | ||
return ""; | ||
return FileCacheName + "/explicitly-built-modules/" + | ||
MD.ID.ModuleName + "-" + MD.ID.ContextHash + ".pcm"; | ||
}; | ||
|
||
std::vector<std::string> Compilation{argv, argv + argc}; | ||
llvm::DenseSet<ModuleID> AlreadySeen; | ||
auto TUDepsOrErr = DepsTool.getTranslationUnitDependencies( | ||
Compilation, WorkingDirectory, AlreadySeen, std::move(LookupOutput)); | ||
if (!TUDepsOrErr) | ||
return ReportFailure() << "failed to generate a reproducer\n" | ||
<< toString(TUDepsOrErr.takeError()); | ||
|
||
TranslationUnitDeps TU = *TUDepsOrErr; | ||
llvm::raw_fd_ostream ScriptOS(ScriptFD, /*shouldClose=*/true); | ||
ScriptOS << "# Original command:\n#"; | ||
for (StringRef Arg : Compilation) | ||
ScriptOS << ' ' << Arg; | ||
ScriptOS << "\n\n"; | ||
|
||
ScriptOS << "# Dependencies:\n"; | ||
std::string ReproExecutable = std::string(argv[0]); | ||
auto PrintArguments = [&ReproExecutable, | ||
&FileCacheName](llvm::raw_fd_ostream &OS, | ||
ArrayRef<std::string> Arguments) { | ||
OS << ReproExecutable; | ||
for (int I = 0, E = Arguments.size(); I < E; ++I) | ||
OS << ' ' << Arguments[I]; | ||
OS << " -ivfsoverlay \"" << FileCacheName << "/vfs/vfs.yaml\""; | ||
OS << '\n'; | ||
}; | ||
for (ModuleDeps &Dep : TU.ModuleGraph) | ||
PrintArguments(ScriptOS, Dep.getBuildArguments()); | ||
ScriptOS << "\n# Translation unit:\n"; | ||
for (const Command &BuildCommand : TU.Commands) | ||
PrintArguments(ScriptOS, BuildCommand.Arguments); | ||
|
||
SmallString<128> VFSCachePath = FileCachePath; | ||
llvm::sys::path::append(VFSCachePath, "vfs"); | ||
std::string VFSCachePathStr = VFSCachePath.str().str(); | ||
llvm::FileCollector FileCollector(VFSCachePathStr, | ||
/*OverlayRoot=*/VFSCachePathStr); | ||
for (const auto &FileDep : TU.FileDeps) { | ||
FileCollector.addFile(FileDep); | ||
} | ||
for (ModuleDeps &ModuleDep : TU.ModuleGraph) { | ||
ModuleDep.forEachFileDep([&FileCollector](StringRef FileDep) { | ||
FileCollector.addFile(FileDep); | ||
}); | ||
} | ||
if (FileCollector.copyFiles(/*StopOnError=*/true)) | ||
return ReportFailure() | ||
<< "failed to copy the files used for the compilation"; | ||
SmallString<128> VFSOverlayPath = VFSCachePath; | ||
llvm::sys::path::append(VFSOverlayPath, "vfs.yaml"); | ||
if (FileCollector.writeMapping(VFSOverlayPath)) | ||
return ReportFailure() << "failed to write a VFS overlay mapping"; | ||
|
||
return Report(CXError_Success) | ||
<< "Created a reproducer. Sources and associated run script(s) are " | ||
"located at:\n " | ||
<< FileCachePath << "\n " << ReproScriptPath; | ||
} | ||
|
||
enum CXErrorCode clang_experimental_DependencyScannerWorker_getDepGraph( | ||
CXDependencyScannerWorker W, | ||
CXDependencyScannerWorkerScanSettings CXSettings, CXDepGraph *Out) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should use
CXDependencyScannerWorkerScanSettings
instead ofint argc, const char *const *argv, const char *ModuleName, const char *WorkingDirectory
for the basic compilation setup.How sure are we that
const char *ReproducerLocation, bool UseUniqueReproducerName,
are the only two reproducer-specific options we will ever want here?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going back and forth about it and can be persuaded either way. My [weak] justification is that we don't pass the settings to a dependency service/worker/tool. So it's not a self-sufficient entity but more a convenience container that happens to be the same between different functions. So I avoid treating it as an independent entity.
For the potential enhancements I expect only something to help capture the scanner state. Unfortunately, I don't know what that would be. Maybe .pcm cache, maybe dumping stat cache, maybe something else. So far it is too vague and not the most important direction for the next steps. That's why I'd like to land something reasonably future-proof and iterate on it to see what else we might need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that exactly what we do? The difference is that you're creating a new one instead of using an existing one.
If you want to make this future proof the C API should probably be much more opaque about the configuration, similar to what we do with the worker scan settings where the settings become their own opaque type and we can add new C functions over time to add new settings to that configuration object without breaking the existing API.