Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIRRTL] Dedup: speed up handling of instances #7815

Merged
merged 1 commit into from
Nov 14, 2024

Conversation

youngar
Copy link
Member

@youngar youngar commented Nov 14, 2024

Dedup tries to hash all modules in parallel. To accomplish this, the names of instantiated modules are not included as part of the structural hash, but they are taken in to account when checking if two modules are the same. This process involves comparing the instantiated children modules of two modules if their hashes match. This was implemented by using an array attribute, to make comparisons quicker.

When a module or class has many thousands of instances underneath it, it becomes impractical to build a array attribute with every child module. Interning such a large ArrayAttr is incredibly slow and will eat up that memory for the rest of the process.

Instead, we don't bother interning the instance arrays, and just keep them as plain old vectors, which comes with the benefit of not eagerly interning gigantic arrays.

Dedup tries to hash all modules in parallel.  To accomplish this, the
names of instantiated modules are not included as part of the structural
hash, but they are taken in to account when checking if two modules are
the same.  This process involves comparing the instantiated children
modules of two modules if their hashes match.  This was implemented by
using an array attribute, to make comparisons quicker.

When a module or class has many thousands of instances underneath it, it
becomes impractical to build a array attribute with every child module.
Interning such a large ArrayAttr is incredibly slow and will eat up that
memory for the rest of the process.

Instead, we don't bother interning the instance arrays, and just keep
them as plain old vectors, which comes with the benefit of not eagerly
interning gigantic arrays.
@youngar youngar added the FIRRTL Involving the `firrtl` dialect label Nov 14, 2024
};

static bool operator==(const ModuleInfo &lhs, const ModuleInfo &rhs) {
return lhs.structuralHash == rhs.structuralHash &&
lhs.referredModuleNames == rhs.referredModuleNames;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any chance this vector comparison rather expensive than interning them as ArrayAttr? Not sure how common it is but if there are modules that has same hash and instances within them refer to different modules, it could be possible that this comparison occurs multiple times.

Copy link
Member Author

@youngar youngar Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, its possible it does the comparison multiple times. I was thinking that the same comparison problem could happen when creating the ArrayAttr by interning the vector in a DenseSet, so its probably similar in the worst case of both. I just noticed that StorageUniquer stores a key's hash as part of the key, which seems like it could help reduce expensive comparisons 🤔

Copy link
Member

@uenoku uenoku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question on the cost of comparison but the change makes sense, thanks!

@youngar youngar merged commit bf43ca2 into llvm:main Nov 14, 2024
4 checks passed
@youngar youngar deleted the firrtl-dedup-referred-instances branch November 14, 2024 07:51
@@ -89,9 +90,14 @@ struct ModuleInfo {
// SHA256 hash.
std::array<uint8_t, 32> structuralHash;
// Module names referred by instance op in the module.
mlir::ArrayAttr referredModuleNames;
std::vector<StringAttr> referredModuleNames;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SmallVector should be fine for these.

@@ -359,7 +363,7 @@ struct StructuralHasher {
DenseMap<StringAttr, SymbolTarget> innerSymTargets;

// This keeps track of module names in the order of the appearance.
SmallVector<mlir::StringAttr> referredModuleNames;
std::vector<StringAttr> referredModuleNames;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SmallVector should still be fine for these.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FIRRTL Involving the `firrtl` dialect
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants