Skip to content

Implement SymbolInfo::getFilename() on Win32 #62577

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

grynspan
Copy link
Contributor

Here we are again! This change implements SymbolInfo::getFilename() on Win32. It lazily loads the string the first time somebody requests it (since most callers don't ask for it.) It is necessary to change call sites because they currently assume the string is immortal, which can lead to a use-after-free scenario on Windows. So callers must now copy the string if they keep it longer than the calling scope.

As discussed in my last PR, I reviewed my use of auto with SymbolInfo. I changed uses of it when getting fields so the type is explicit, but I left it in for the return type of SymbolInfo::lookup() because, IMHO, it's clear what the type will be in that context, and typing out llvm::Optional<SymbolInfo> consumes a significant portion of our 80-column limit.

Because the filename may include Unicode characters, I need to convert from UTF-16 (wchar_t on Windows) to UTF-8 (char.) So I moved the copyUTF8FromWide() function out from CommandLine.cpp into its own file in the runtime and reused it in SymbolInfo.cpp.

This change also stores the input address to SymbolInfo::lookup() and makes it accessible via getAddress()—I have grand plans for it later!

@grynspan
Copy link
Contributor Author

@swift-ci please test

Copy link
Contributor

@mikeash mikeash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two tiny nits, looks good overall.

@grynspan grynspan requested review from compnerd and mikeash December 14, 2022 16:46
@grynspan grynspan force-pushed the jgrynspan/add-symbolinfo-filename-to-win32 branch from 30b7be5 to 29dc84a Compare December 14, 2022 18:47
@grynspan
Copy link
Contributor Author

@swift-ci please test

1 similar comment
@grynspan
Copy link
Contributor Author

@swift-ci please test

@grynspan
Copy link
Contributor Author

@swift-ci please test windows

1 similar comment
@grynspan
Copy link
Contributor Author

@swift-ci please test windows

@grynspan
Copy link
Contributor Author

@swift-ci please clean build windows

@grynspan
Copy link
Contributor Author

Blocked by swiftlang/llvm-project#5792.

@al45tair
Copy link
Contributor

So I moved the copyUTF8FromWide() function out from CommandLine.cpp into its own file in the runtime and reused it in SymbolInfo.cpp.

Hmmm. See Win32.cpp/Win32.h in #62462. I need the same code there too (I didn't remove the copy from CommandLine.cpp however, but I probably should have.

@grynspan
Copy link
Contributor Author

So I moved the copyUTF8FromWide() function out from CommandLine.cpp into its own file in the runtime and reused it in SymbolInfo.cpp.

Hmmm. See Win32.cpp/Win32.h in #62462. I need the same code there too (I didn't remove the copy from CommandLine.cpp however, but I probably should have.

This change is just about to land—shall we coordinate here?

@grynspan
Copy link
Contributor Author

Waiting for Alastair to land some overlapping changes.

@grynspan grynspan force-pushed the jgrynspan/add-symbolinfo-filename-to-win32 branch from 9a7fd5c to 9ef116a Compare December 15, 2022 12:32
@grynspan
Copy link
Contributor Author

@swift-ci please test

Copy link
Contributor

@CodaFi CodaFi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not much of a Windows expert, so I can only offer my 👍 on the structure and defensiveness of the calls into and out of Win32

@grynspan
Copy link
Contributor Author

grynspan commented Jan 6, 2023

@al45tair Am I good to get back to this? :)

@al45tair
Copy link
Contributor

al45tair commented Jan 6, 2023

Think so, yes.

@grynspan grynspan force-pushed the jgrynspan/add-symbolinfo-filename-to-win32 branch from 9ef116a to 84ab54a Compare January 6, 2023 17:29
@grynspan
Copy link
Contributor Author

grynspan commented Jan 6, 2023

@swift-ci please test

@grynspan grynspan requested review from al45tair and compnerd January 6, 2023 17:29
@grynspan
Copy link
Contributor Author

grynspan commented Jan 7, 2023

@swift-ci please test

defer {
_freeMetadataSectionName(cName)
}
let name = cName.flatMap { String(validatingUTF8: $0) } ?? "<unavailable>"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:<unknown> might be better to match what we do for symbols which we cannot symbolize?

#pragma mark - DbgHelp library thread-safety

static LazyMutex dbgHelpMutex;
static HANDLE dbgHelpHandle = nullptr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use INVALID_HANDLE_VALUE instead of nullptr? It seems more appropriate even though it may make the checked condition more verbose.

return nullptr;
}
#if defined(_WIN32) && !defined(__CYGWIN__)
return PathFindFileNameA(imagePath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry about this as technically pszImagePath is limited to MAX_PATH, which may or may not hold, particularly in the Swift world where we go through efforts to use NT style paths to avoid the length limitations. A far less elegant solution which may be safer (I haven't thought hard enough about it) is to use PathCchRemoveFileSpec and then use that to compute the offset into the buffer to compute the "file spec" (why Microsoft does not provide a PathCchGetFileSpec is unclear to me).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, when it says that's the limit, it's probably not kidding; most likely PathFindFileNameA has a buffer of that size that it uses to convert from ANSI to UTF-16.

Which also brings up the related point that this function will assume ANSI encoding and might produce unusual results if passed UTF-8, which is what we currently have.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Easiest would be to just find the last slash and truncate, same as what we do on Darwin/Linux.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remembering that both slashes and backslashes are valid path separators on Windows in most places (aside from \\?\ wide-string prefixed paths and NT-style paths, the latter of which you can't use anyway).

libraryName = libraryName.substr(libraryName.rfind('/')).substr(1);
const char *libraryName = syminfo->getImageFilename();
if (!libraryName) {
libraryName = "<unavailable>";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: <unknown> might be better to match what we do with functions which we cannot symbolize.

@grynspan
Copy link
Contributor Author

@al45tair Does this still make sense to complete given your backtrace work?

@al45tair
Copy link
Contributor

It'll be a little while before I get it all working on Windows, so if it's useful to fix this in the meantime I say go for it.

@grynspan
Copy link
Contributor Author

grynspan commented Jan 7, 2025

This branch is stale at this point. Closing.

@grynspan grynspan closed this Jan 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants