[FIRRTL] Dedup: speed up handling of instances #7815

youngar · 2024-11-14T06:45:18Z

Dedup tries to hash all modules in parallel. To accomplish this, the names of instantiated modules are not included as part of the structural hash, but they are taken into account when checking if two modules are the same. This process involves comparing the instantiated child modules of two modules if their hashes match. This was implemented using an array attribute to make comparisons quicker.

When a module or class has many thousands of instances underneath it, it becomes impractical to build an array attribute with every child module. Interning such a large ArrayAttr is incredibly slow, and the interned array occupies memory for the rest of the process.

Instead, we no longer intern the instance arrays and just keep them as plain old vectors, which avoids eagerly interning gigantic arrays.
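The equality check described above can be sketched without any MLIR dependency. This is only an illustration under stated assumptions: `ModuleInfoSketch` is a hypothetical stand-in for the pass's `ModuleInfo`, and `std::string` replaces `StringAttr` so the snippet compiles on its own.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the pass's ModuleInfo. The real struct stores
// StringAttr names; std::string is an assumption made so that this sketch
// has no MLIR dependency.
struct ModuleInfoSketch {
  // SHA256 digest of the module body. Names of instantiated modules are
  // deliberately not part of this hash.
  std::array<uint8_t, 32> structuralHash{};
  // Names of the instantiated modules, in order of appearance. Kept as a
  // plain vector instead of an interned array.
  std::vector<std::string> referredModuleNames;
};

// Two modules are considered the same only if their structural hashes match
// and they instantiate the same modules in the same order.
inline bool operator==(const ModuleInfoSketch &lhs,
                       const ModuleInfoSketch &rhs) {
  return lhs.structuralHash == rhs.structuralHash &&
         lhs.referredModuleNames == rhs.referredModuleNames;
}
```

Because `&&` short-circuits, the element-wise vector comparison only runs when the 32-byte hashes already agree.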

uenoku · 2024-11-14T06:56:15Z

lib/Dialect/FIRRTL/Transforms/Dedup.cpp

 };

+static bool operator==(const ModuleInfo &lhs, const ModuleInfo &rhs) {
+  return lhs.structuralHash == rhs.structuralHash &&
+         lhs.referredModuleNames == rhs.referredModuleNames;


Is there any chance this vector comparison is more expensive than interning them as ArrayAttr? I'm not sure how common it is, but if there are modules that have the same hash while the instances within them refer to different modules, this comparison could occur multiple times.

Yeah, it's possible it does the comparison multiple times. I was thinking that the same comparison problem could happen when creating the ArrayAttr by interning the vector in a DenseSet, so it's probably similar in the worst case for both. I just noticed that StorageUniquer stores a key's hash as part of the key, which seems like it could help reduce expensive comparisons 🤔
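The idea of carrying a precomputed hash alongside the key can be sketched as follows. This is not StorageUniquer's implementation and not part of this PR: `HashedNames`, `cachedHash`, and the hash-combine formula are hypothetical choices made for illustration, with `std::string` again standing in for `StringAttr`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical key type: an ordered list of module names plus a hash that is
// computed once at construction time.
class HashedNames {
public:
  explicit HashedNames(std::vector<std::string> names)
      : names(std::move(names)), hash(computeHash(this->names)) {}

  // Reject on the cached hash first; fall back to the element-wise
  // comparison only when the hashes agree.
  friend bool operator==(const HashedNames &lhs, const HashedNames &rhs) {
    return lhs.hash == rhs.hash && lhs.names == rhs.names;
  }

  std::size_t cachedHash() const { return hash; }

private:
  // Order-sensitive combination of the per-name hashes.
  static std::size_t computeHash(const std::vector<std::string> &names) {
    std::size_t seed = names.size();
    for (const std::string &name : names)
      seed ^= std::hash<std::string>{}(name) + 0x9e3779b9 + (seed << 6) +
              (seed >> 2);
    return seed;
  }

  // Declared before `hash` so it is initialized first.
  std::vector<std::string> names;
  std::size_t hash;
};
```

With this layout, two long lists that differ are usually rejected by a single integer comparison. The full vector comparison still runs when the hashes agree, so the worst case for genuinely equal lists is unchanged.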

uenoku

One question on the cost of comparison but the change makes sense, thanks!

darthscsi · 2024-11-14T16:18:08Z

lib/Dialect/FIRRTL/Transforms/Dedup.cpp

@@ -89,9 +90,14 @@ struct ModuleInfo {
  // SHA256 hash.
  std::array<uint8_t, 32> structuralHash;
  // Module names referred by instance op in the module.
-  mlir::ArrayAttr referredModuleNames;
+  std::vector<StringAttr> referredModuleNames;


SmallVector should be fine for these.

darthscsi · 2024-11-14T16:18:31Z

lib/Dialect/FIRRTL/Transforms/Dedup.cpp

@@ -359,7 +363,7 @@ struct StructuralHasher {
  DenseMap<StringAttr, SymbolTarget> innerSymTargets;

  // This keeps track of module names in the order of the appearance.
-  SmallVector<mlir::StringAttr> referredModuleNames;
+  std::vector<StringAttr> referredModuleNames;


SmallVector should still be fine for these.

youngar added the FIRRTL Involving the `firrtl` dialect label Nov 14, 2024

youngar requested review from darthscsi and seldridge as code owners November 14, 2024 06:45

uenoku reviewed Nov 14, 2024

uenoku approved these changes Nov 14, 2024

youngar merged commit bf43ca2 into llvm:main Nov 14, 2024
4 checks passed

youngar deleted the firrtl-dedup-referred-instances branch November 14, 2024 07:51

darthscsi reviewed Nov 14, 2024
