Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[vulkan] Reduce descriptor sets, use official headers, improve allocator, remove module destructor #8452

Merged
merged 65 commits into from
Dec 9, 2024
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
Show all changes
65 commits
Select commit Hold shift + click to select a range
7803cd8
Reduce resource usage for large pipelines by compiling one entry poin…
Oct 30, 2024
bf11f0f
Create a new emitter from scratch inside the add_kernel() method.
Oct 31, 2024
fde4019
Clang format pass
Oct 31, 2024
eb17d68
Fix ambiguous conversion from path to std::string
Oct 31, 2024
8afb1c0
Use explicit string method rather than constructor for path conversion.
Oct 31, 2024
2b3aaa9
Fix file path stem to string conversion.
Oct 31, 2024
4a8ff4c
Re-enable performance wrap test for Vulkan.
Oct 31, 2024
8cc1c32
Trigger CI for testing
Nov 5, 2024
a7586f1
Mark transfer buffers with SRC & DST usage bits to allow re-use witho…
Nov 6, 2024
3ba8b67
Change interal_error to user_assert for un-implemented features.
Nov 6, 2024
53e81ca
Add note to Vulkan.md describing validation layer usage and deb packa…
Nov 6, 2024
aae923a
Merge branch 'main' into dg/reduce-vulkan-desc-sets
Nov 6, 2024
61e59b7
Clang format
Nov 6, 2024
2a2bf38
Switch to halide_mutex for locking Vulkan context (to match other GPU…
Nov 6, 2024
0951099
Trigger CI for testing
Nov 7, 2024
9774414
Merge branch 'main' into dg/reduce-vulkan-desc-sets
Nov 8, 2024
afb8cc6
Reduce memory pressure by reducing default minimum block size to 4KB …
Nov 8, 2024
5671953
Clang format pass
Nov 8, 2024
cc0b5e2
Modify Vulkan allocation routine to fallback to any valid memory type…
Nov 16, 2024
7bdbafe
Re-enable Vulkan for entire GPU allocation test.
Nov 16, 2024
104a8e0
Fix debug messages to iterate across dimensions
Nov 20, 2024
8593273
Merge branch 'main' of github.com:halide/Halide
Nov 20, 2024
d302f71
Merge branch 'main' into dg/reduce-vulkan-desc-sets
Nov 20, 2024
0c937fc
Merge branch 'main' of github.com:halide/Halide
Nov 20, 2024
b308e36
Merge branch 'main' into dg/reduce-vulkan-desc-sets
Nov 20, 2024
a2df3cb
Avoid calling Vulkan API methods in module destructor, since some dri…
Nov 21, 2024
5721404
Update object lifetime and leak tests to manually invoke the Vulkan d…
Nov 21, 2024
e4f4896
Refactor compilation cache destruction to use C-ABI methods rather th…
Nov 22, 2024
b081a45
Don't return an error if the instance pointer is invalid, since it in…
Nov 22, 2024
a1031b9
Clang tidy pass
Nov 22, 2024
56560ad
Fix CodeGen for small signed integers (need to be sign extended and p…
Nov 22, 2024
a94d535
Remove stale mini_vulkan.h and use official Vulkan headers (setup the…
Nov 26, 2024
2273def
Ignore dependencies/vulkan for clang format
Nov 26, 2024
b354b55
Use -Isystem path for Vulkan Headers
Nov 26, 2024
ca967cc
Clang tidy cleanup pass
Nov 26, 2024
dbc5f81
Fix formatting
Nov 26, 2024
36ff6f5
Fix makefile build (add missing runtime include path for vulkan heade…
Nov 26, 2024
b37116f
Use spaces not tabs
Nov 26, 2024
a59948c
Use cmake's find_package to locate the Vulkan Headers, defaulting to …
Nov 26, 2024
74b6eed
Add "vulkan-headers" to vcpkg.json. Update builtin-baseline to d567b…
Nov 26, 2024
43470cb
Split Vulkan API function pointers into three groups ... loader, inst…
Nov 26, 2024
5bd7b41
Clang format pass
Nov 26, 2024
bf0bdda
Remove manual destruction in object lifetime tests for Vulkan
Nov 26, 2024
f528e64
Remove trailing comment in array
Nov 26, 2024
54d9450
Remove leftover print
Nov 26, 2024
75897c6
Merge branch 'main' of github.com:halide/Halide
Nov 27, 2024
3f6a6ba
Merge branch 'main' into dg/reduce-vulkan-desc-sets
Nov 27, 2024
92241df
Add optional custom dtor to JITModuleContents to allow JIT-only destr…
Nov 28, 2024
2b4d2d9
Trigger build to test latest drivers
Dec 2, 2024
4660552
Safeguard Vulkan destructor against being called if runtime wasn't in…
Dec 3, 2024
e040eeb
Formatting pass.
Dec 3, 2024
58148a9
Disable custom JIT destructor to see if segfaults still occur outside…
Dec 4, 2024
4d317d9
Test module destructor with latest driver.
Dec 4, 2024
c77b25d
Formatting pass
Dec 4, 2024
33dc2ff
Only disable CUDA and OpenCL
Dec 4, 2024
d1b4b7c
Revert CMakeLists ... disabling targets prematurely fails the buildbo…
Dec 4, 2024
2e2d837
Add VK_EXT_DEBUG_UTILS_EXTENSION_NAME to optional instance extensions.
Dec 4, 2024
115d7e4
Make clang tidy happy
Dec 5, 2024
06d0d35
Format & Tidy
Dec 5, 2024
6c1ed30
Revert changes to JITModule which added a custom dtor. Use standard
Dec 9, 2024
6ad058b
Revert changes to HalideRuntimeVulkan that added a custom dtor.
Dec 9, 2024
f956f40
Re-enable GPU object lifetime management tests and leak device tests for
Dec 9, 2024
d180921
Formatting
Dec 9, 2024
ee3c3d9
Merge branch 'main' into dg/reduce-vulkan-desc-sets
Dec 9, 2024
12c3a49
trigger buildbots
steven-johnson Dec 9, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
211 changes: 177 additions & 34 deletions src/CodeGen_Vulkan_Dev.cpp
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
#include <algorithm>
#include <fstream> // for dump to file
#include <filesystem> // for dump to file
#include <fstream> // for dump to file
#include <sstream>
#include <unordered_set>

Expand Down Expand Up @@ -114,9 +115,9 @@ class CodeGen_Vulkan_Dev : public CodeGen_GPU_Dev {

// Top-level function for adding kernels
void add_kernel(const Stmt &s, const std::string &name, const std::vector<DeviceArgument> &args);
void init_module();
void compile(std::vector<char> &binary);
void dump() const;
void init_spirv_module();
void encode_spirv_module(std::vector<char> &binary);
void dump_spirv_module() const;

// Encode the descriptor sets into a sidecar which will be added
// as a header to the module prior to the actual SPIR-V binary
Expand Down Expand Up @@ -282,10 +283,32 @@ class CodeGen_Vulkan_Dev : public CodeGen_GPU_Dev {

// Target for codegen
Target target;
};

} emitter;
// Current target for codegen
Target current_target;

// Current kernel name for CodeGen
std::string current_kernel_name;

// In order to avoid using up all descriptor sets for complicated pipelines,
// we will encode each Kernel entry-point into it's own SPIR-V module and
// bind them as separate shaders to avoid running out of resources on
// constrained devices.
struct KernelModule {
std::string kernel_name;
std::vector<char> spirv_module; // header + binary
};
using KernelModuleTable = std::vector<KernelModule>;
KernelModuleTable kernel_module_table;

// merge the contents of the kernel module table into a single binary
// containing a header followed by the SPIR-V modules for each kernel
void encode_module(std::vector<char> &module);

// dump the contents of the combined module, outputting separate
// SPIR-V binary files for each kernel
void dump_module(const std::vector<char> &module);
};

// Check if all loads and stores to the member 'buffer' are dense, aligned, and
Expand Down Expand Up @@ -2244,7 +2267,7 @@ void CodeGen_Vulkan_Dev::SPIRV_Emitter::reset() {
reset_workgroup_size();
}

void CodeGen_Vulkan_Dev::SPIRV_Emitter::init_module() {
void CodeGen_Vulkan_Dev::SPIRV_Emitter::init_spirv_module() {
reset();

if (target.has_feature(Target::VulkanV13)) {
Expand Down Expand Up @@ -2287,6 +2310,11 @@ std::vector<char> encode_header_string(const std::string &str) {
void CodeGen_Vulkan_Dev::SPIRV_Emitter::encode_header(SpvBinary &spirv_header) {
debug(2) << "CodeGen_Vulkan_Dev::SPIRV_Emitter::encode_header\n";

// NOTE: This header is a pre-amble to the actual SPIR-V module that's passed
// to the driver. It describes the descriptor sets and other metadata
// that's necessary to bind the compiled shader. This is combined with
// the actual SPIR-V module to form a "kernel module".
//
// Encode a sidecar for the module that lists the descriptor sets
// corresponding to each entry point contained in the module.
//
Expand Down Expand Up @@ -2724,8 +2752,8 @@ void CodeGen_Vulkan_Dev::SPIRV_Emitter::declare_device_args(const Stmt &s, uint3
descriptor_set_table.push_back(descriptor_set);
}

void CodeGen_Vulkan_Dev::SPIRV_Emitter::compile(std::vector<char> &module) {
debug(2) << "CodeGen_Vulkan_Dev::SPIRV_Emitter::compile\n";
void CodeGen_Vulkan_Dev::SPIRV_Emitter::encode_spirv_module(std::vector<char> &module) {
debug(2) << "CodeGen_Vulkan_Dev::SPIRV_Emitter::encode_spirv_module\n";

// First encode the descriptor set bindings for each entry point
// as a sidecar which we will add as a preamble header to the actual
Expand Down Expand Up @@ -2826,19 +2854,19 @@ void CodeGen_Vulkan_Dev::SPIRV_Emitter::add_kernel(const Stmt &s,
storage_access_map.clear();
}

void CodeGen_Vulkan_Dev::SPIRV_Emitter::dump() const {
debug(2) << "CodeGen_Vulkan_Dev::SPIRV_Emitter::dump()\n";
void CodeGen_Vulkan_Dev::SPIRV_Emitter::dump_spirv_module() const {
debug(2) << "CodeGen_Vulkan_Dev::SPIRV_Emitter::dump_spirv_module()\n";
std::cerr << builder.current_module();
}

CodeGen_Vulkan_Dev::CodeGen_Vulkan_Dev(Target t)
: emitter(t) {
: current_target(t) {
// Empty
}

void CodeGen_Vulkan_Dev::init_module() {
debug(2) << "CodeGen_Vulkan_Dev::init_module\n";
emitter.init_module();
kernel_module_table.clear();
}

void CodeGen_Vulkan_Dev::add_kernel(Stmt stmt,
Expand All @@ -2854,21 +2882,150 @@ void CodeGen_Vulkan_Dev::add_kernel(Stmt stmt,
<< stmt;

current_kernel_name = name;

// Create the SPIR-V emitter
SPIRV_Emitter emitter(current_target);
emitter.init_spirv_module();
emitter.add_kernel(stmt, name, args);

// dump the SPIRV file if requested
if (getenv("HL_SPIRV_DUMP_FILE")) {
dump();
// Encode the SPIR-V module (header + binary)
std::vector<char> spirv_module;
emitter.encode_spirv_module(spirv_module);

// Dump the SPIR-V if debug is enabled
if (debug::debug_level() >= 2) {
emitter.dump_spirv_module();
}

// Copy the SPIR-V module into the Kernel Module table
KernelModule kernel_module;
kernel_module.kernel_name = name;
kernel_module.spirv_module = spirv_module;
kernel_module_table.push_back(kernel_module);
}

std::vector<char> CodeGen_Vulkan_Dev::compile_to_src() {
debug(2) << "CodeGen_Vulkan_Dev::compile_to_src\n";

std::vector<char> module;
emitter.compile(module);
encode_module(module);

// dump the SPIRV file if requested
if (getenv("HL_SPIRV_DUMP_FILE")) {
dump_module(module);
}

return module;
}

void CodeGen_Vulkan_Dev::encode_module(std::vector<char> &module) {

debug(2) << "CodeGen_Vulkan_Dev::encode_module\n";

// NOTE: The module generated by this method is an amalgamation of all the
// "kernel modules". It consists of a simple header indicating the
// number of "kernel modules" and their respective sizes, followed by
// a "SPIR-V module" for each kernel. Each "SPIR-V module" contains a
// pre-amble header describing the descriptor sets and metadata necessary
// for binding the shader, followed by the actual SPIR-V binary that's
// submitted to the driver.
//
// Previously, we encoded everything into a single "SPIR-V module" with
// entry-points for each kernel in the pipeline, but for complex pipelines,
// this could easily exceed the number of available descriptor sets available
// at runtime on resource constrained devices. As a compromise, we only
// generate a single SPIR-V module per entry-point to reduce the number
// of required descriptor sets needed to bind the shader. However, this
// requires us to merge them into an amalgation, which is why this two-level
// encoding exists.
//
//
// [0] Number of Kernel Modules (uint32_t)
// ... For each kernel module ...
// ... [0] Byte count indicating size of "Kernel Module" entry (uint32_t)
// [1] Kernel Module Table
// ... For each kernel module ...
// ....[0] The SPIR-V Module for the kernel
// ....... (see encode_spirv_module() for details)
//
// NOTE: Halide's Vulkan runtime consumes this module prior to compiling.
//
// Both vk_decode_shader_bindings() and vk_compile_shader_module() will
// need to be updated if this encoding ever changes!
//

// Nothing to do if table is empty
if (kernel_module_table.empty()) {
return;
}

// Encode a module header consisting of the number of kernels, followed by the binary size of each
size_t binary_bytes = 0;
SpvBinary module_header;
uint32_t kernel_count = (uint32_t)kernel_module_table.size();
debug(1) << " kernel_count = " << kernel_count << "\n";

module_header.push_back(kernel_count);
uint32_t n = 0;
for (const KernelModule &kernel_module : kernel_module_table) {
uint32_t spirv_module_size = (uint32_t)kernel_module.spirv_module.size();
debug(1) << " spirv_module_size[" << n++ << "] = " << spirv_module_size << " bytes\n";
module_header.push_back(spirv_module_size);
binary_bytes += spirv_module_size;
}

size_t header_bytes = module_header.size() * sizeof(uint32_t);

debug(2) << " encoding module ("
<< "header_size: " << (uint32_t)(header_bytes) << ", "
<< "binary_size: " << (uint32_t)(binary_bytes) << ")\n";

// Combine the header and each kernel module into the binary we will be returning
module.reserve(header_bytes + binary_bytes);
module.insert(module.end(), (const char *)module_header.data(), (const char *)(module_header.data() + module_header.size()));
for (const KernelModule &kernel_module : kernel_module_table) {
module.insert(module.end(), (const char *)kernel_module.spirv_module.data(), (const char *)(kernel_module.spirv_module.data() + kernel_module.spirv_module.size()));
}
}

void CodeGen_Vulkan_Dev::dump_module(const std::vector<char> &module) {
// Get the dump file name from the env (default to out.spv if unspecified)
const char *dump_file = getenv("HL_SPIRV_DUMP_FILE") ? getenv("HL_SPIRV_DUMP_FILE") : "out.spv";
std::filesystem::path dump_file_path(dump_file);

// Determine the number of kernels and their binary sizes
uint32_t word_offset = 0;
const uint32_t *module_header = (const uint32_t *)(module.data());
uint32_t kernel_count = module_header[word_offset++];
std::vector<uint32_t> binary_sizes;
for (uint32_t i = 0; (i < kernel_count) && ((word_offset * sizeof(uint32_t)) < module.size()); ++i) {
binary_sizes.push_back(module_header[word_offset++]);
}

// Dump the SPIR-V binary for each kernel as a separate file
size_t byte_offset = word_offset * sizeof(uint32_t);
for (uint32_t i = 0; (i < kernel_count) && (byte_offset < module.size()); ++i) {

// Skip the header and only output the SPIR-V binary for the kernel
const uint32_t *decode = (const uint32_t *)(module.data() + byte_offset);
uint32_t header_word_count = decode[0];
size_t header_size = header_word_count * sizeof(uint32_t);
const uint32_t *spirv_ptr = (decode + header_word_count);
size_t spirv_size = (binary_sizes[i] - header_size);

// Add the kernel index to the dump filename
std::string dump_kernel_file = dump_file_path.stem().string();
dump_kernel_file += "_k" + std::to_string(i) + dump_file_path.extension().string();

debug(1) << "Vulkan: Dumping SPIRV module to file: '" << dump_kernel_file << "'\n";
std::ofstream f(dump_kernel_file.c_str(), std::ios::out | std::ios::binary);
f.write((const char *)(spirv_ptr), spirv_size);
f.close();

byte_offset += binary_sizes[i];
}
}

std::string CodeGen_Vulkan_Dev::get_current_kernel_name() {
return current_kernel_name;
}
Expand All @@ -2878,23 +3035,9 @@ std::string CodeGen_Vulkan_Dev::print_gpu_name(const std::string &name) {
}

void CodeGen_Vulkan_Dev::dump() {
std::vector<char> module = compile_to_src();

// Print the contents of the compiled SPIR-V module
emitter.dump();

// Skip the header and only output the SPIR-V binary
const uint32_t *decode = (const uint32_t *)(module.data());
uint32_t header_word_count = decode[0];
size_t header_size = header_word_count * sizeof(uint32_t);
const uint32_t *binary_ptr = (decode + header_word_count);
size_t binary_size = (module.size() - header_size);

const char *filename = getenv("HL_SPIRV_DUMP_FILE") ? getenv("HL_SPIRV_DUMP_FILE") : "out.spv";
debug(1) << "Vulkan: Dumping SPIRV module to file: '" << filename << "'\n";
std::ofstream f(filename, std::ios::out | std::ios::binary);
f.write((const char *)(binary_ptr), binary_size);
f.close();
std::vector<char> module;
encode_module(module);
dump_module(module);
}

} // namespace
Expand All @@ -2918,4 +3061,4 @@ std::unique_ptr<CodeGen_GPU_Dev> new_CodeGen_Vulkan_Dev(const Target &target) {
} // namespace Internal
} // namespace Halide

#endif // WITH_SPIRV
#endif // WITH_SPIRV
Loading
Loading