Skip to content

[Driver][SYCL] Improve FPGA archive device unbundling with AOCO #9572

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
May 30, 2023

Conversation

maksimsab
Copy link
Contributor

@maksimsab maksimsab commented May 23, 2023

AOCO specific objects can be added to an archive. When this occurs,
the archive was being completely ignored, causing issues with device
compilation. Take advantage of the additional -excluded-targets option
from the clang-offload-bundler so we can process these archives for
the needed device information, excluding potential unbundling of the
target device in the presense of aoco binaries in that same object.

@maksimsab maksimsab temporarily deployed to aws May 24, 2023 13:50 — with GitHub Actions Inactive
@asudarsa
Copy link
Contributor

Hi @maksimsab

Can you please add some description to this PR?

Thanks

@maksimsab maksimsab temporarily deployed to aws May 24, 2023 14:21 — with GitHub Actions Inactive
Copy link
Contributor

@asudarsa asudarsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bundler changes LGTM. Thanks

@maksimsab maksimsab temporarily deployed to aws May 24, 2023 15:34 — with GitHub Actions Inactive
AOCO specific objects can be added to an archive.  When this occurs,
the archive was being completely ignored, causing issues with device
compilation.  Take advantage of the additional -excluded-targets option
from the clang-offload-bundler so we can process these archives for
the needed device information, excluding potential unbundling of the
target device in the presense of aoco binaries in that same object.

Co-authored-by: Michael Toguchi <michael.d.toguchi@intel.com>
@maksimsab maksimsab changed the title [SYCL][NFC] add excluded-targets option in clang-offload-bundler [Driver][SYCL] Improve FPGA archive device unbundling with AOCO May 24, 2023
@maksimsab maksimsab marked this pull request as ready for review May 24, 2023 15:48
@maksimsab maksimsab requested review from a team as code owners May 24, 2023 15:48
@maksimsab maksimsab requested a review from mdtoguchi May 24, 2023 15:50
@maksimsab maksimsab temporarily deployed to aws May 24, 2023 16:15 — with GitHub Actions Inactive
@maksimsab maksimsab temporarily deployed to aws May 24, 2023 16:47 — with GitHub Actions Inactive
@maksimsab maksimsab temporarily deployed to aws May 25, 2023 17:02 — with GitHub Actions Inactive
@maksimsab maksimsab temporarily deployed to aws May 25, 2023 17:34 — with GitHub Actions Inactive
Copy link
Contributor

@hchilama hchilama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Driver changes LGTM, thanks

@@ -1215,6 +1241,66 @@ class ArchiveFileHandler final : public FileHandler {
Error WriteBundle(raw_fd_ostream &OS, MemoryBuffer &Input) override {
llvm_unreachable("unsupported for the ArchiveFileHandler");
}

private:
Copy link
Contributor

@asudarsa asudarsa May 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are opening and parsing the file here an additional time for every child. This can prove expensive. There must be a way to do this 'exclusion' on the fly using a single parse of the file. But we can do that as a subsequent work. Please add a TODO here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this optimizations.
There is a cost we should always consider is that this tool is from community. And we should resolve conflicts in sycl-web every time something comes from community in this functionality. So if we change the archive processing significantly in terms of a code then conflict resolving could turn into hard work for one who will maintain this stuff (it is me or somebody from our team).

Copy link
Contributor Author

@maksimsab maksimsab May 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is surely not a blocker for this PR. But we need to address this and atleast verify that this does not increase compile time by a lot. Thanks

Copy link
Contributor

@asudarsa asudarsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Few opportunities for optimization can be addressed later. Thanks

@maksimsab maksimsab temporarily deployed to aws May 26, 2023 11:46 — with GitHub Actions Inactive
@maksimsab maksimsab temporarily deployed to aws May 26, 2023 12:17 — with GitHub Actions Inactive
Copy link
Contributor

@AlexeySachkov AlexeySachkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, there are no tests for clang-offload-bundler changes in this PR.

Comment on lines +1276 to +1282
// NOTE: "-sycldevice" Triple component has been deprecated.
// However, it still can be met in libraries that have been compiled before
// deprecation. For example, here Triple might be the following:
// sycl-fpga_aoco-intel-unknown-sycldevice
//
// The workaround is to strip this Triple component if it is present.
Triple.consume_back("-sycldevice");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have dropped support for such binaries, no need to have that workaround anymore, see #9408

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it looks like FPGA libraries still depend on it. We asked them how did it happen. But it doesn't look like we can drop support of "sycldevice" env.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have effectively done that already. We do aim to preserve backwards compatibility generally, but right now we are in ABI-breaking window and we are allowed to cleanup legacy code, thus breaking backwards compatibility.

Comment on lines 1301 to 1305
for (const auto &TargetName : TargetNames)
if (CheckTargetIsExcluded(TargetName))
return true;

return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return std::count_if() > 0 here as well? cppreference

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CheckTargetIsExcluded is a method. Thus, it can't be passed as UnaryPredicate. Also it can't be made static because it uses BundlerConfig. Trying to make it static and pass BundlerConfig through would make the code look ugly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can be converted into a lamda, since it is fairly small method. But feel free to disregard this comment or apply in a separate follow-up patch

@maksimsab maksimsab temporarily deployed to aws May 26, 2023 15:03 — with GitHub Actions Inactive
@maksimsab maksimsab temporarily deployed to aws May 26, 2023 15:34 — with GitHub Actions Inactive
@maksimsab maksimsab temporarily deployed to aws May 26, 2023 16:11 — with GitHub Actions Inactive
// RUN: llc -filetype=obj -o %t-device2.o %t-device2.bc
// RUN: llc -filetype=obj -o %t-aoco.o %t-aoco.bc
// RUN: llc -filetype=obj -o %t-host.o %t-host.bc
// RUN: llc -filetype=obj -o %t-host2.o %t-host2.bc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to use a pre-built binary/archive and put it in Inputs/SYCL and just test the behavior of clang-offload-bundler? It reduces external dependencies on other tools.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can put it into Inputs folder. However, I still have dependency on cat, xargs, strings and grep tools in the final line of the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@maksimsab maksimsab temporarily deployed to aws May 26, 2023 17:53 — with GitHub Actions Inactive
@maksimsab maksimsab temporarily deployed to aws May 26, 2023 18:24 — with GitHub Actions Inactive
@maksimsab maksimsab temporarily deployed to aws May 30, 2023 12:06 — with GitHub Actions Inactive
@maksimsab maksimsab temporarily deployed to aws May 30, 2023 12:38 — with GitHub Actions Inactive
@AlexeySachkov AlexeySachkov merged commit 4ed3676 into intel:sycl May 30, 2023
@aelovikov-intel
Copy link
Contributor

I see

OffloadBundler.cpp:1259:14: error: call to deleted constructor of 'llvm::Error'
      return Err;

locally.

@maksimsab maksimsab deleted the offload_bundler branch May 30, 2023 17:07
aelovikov-intel added a commit that referenced this pull request May 30, 2023
I believe the issue has been introduced in #9572.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants