-
Notifications
You must be signed in to change notification settings - Fork 299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FIRRTL] Output directory control for layers and modules #6971
Conversation
What is an expected behavior when modules with different output dir annotations are deduplicated? |
Hey, thanks for taking a look! Two modules will dedup iff they have the same output directory annotation (modulo dirname canonicalization when constructing the the hw::OutputFileAttr). I've added a test for this here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LCA algorithm comes across as more complicated than I think it should be. It may be easier to adopt a naive prefix LCA (as that is very understandable) or to go the direction of naive Euler Tour + RMQ (which should be recognizable and has room for performance improvements later if needed). This does appear to be doing a lot of LCA queries hence the RMQ formulation may make sense
Comments throughout.
@@ -0,0 +1,179 @@ | |||
; RUN: firtool --split-input-file %s | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test would be tighter as MLIR and running only on dedup. This allows for violations of initialization checking, i.e., the test can have modules that are basically empty. Alternatively, firtool Foo.fir -parse-only | circt-opt -firrtl-dedup
gives you FIRRTL syntax for the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea is this is testing that firtool -- whatever its pipeline is -- has the expected (or at least consistent, as it's under test) behavior from a user's experience regarding interaction of dedup + output_file (in terms exposed to user, not when/how/where we set what attributes internally).
For example, as-is, moving the output-dir-assignment pass before dedup -- while not changing dedup's behavior -- would cause test failures (that a test narrowed to dedup
would not catch). Specifically regarding the last test in this file (and lower-layers.fir test too, apparently).
And is this indeed the intended behavior / semantics for "output dir" information end-to-end?
Would be good to pin down what is really expected/promised by using these knobs/annotations -- to inform the behavior of dedup or anything else that alters instance hierarchy in a way that changes what-goes-where (inlining, for example -- what if you inline a module with an output_file attribute on it (especially modules that it was the only parent of)?).
How'd y'all frame the intent here -- best-effort output directives, providing users ability to get their desired output even if not specifically promised in a particular way re:pipeline happenstance (but they could use more annotations / etc/ to change what happens if they don't like it?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this test again, this test is very verbose in order to get dedup to work. I am fine to have it be end-to-end. However, I would change it to not have so much logic in every module. Instead, the modules only need a don't touch'd wire (of the same size if you want them to dedup or a different size if not).
if (auto moduleOp = dyn_cast<FModuleOp>(op)) { | ||
moduleOp->setAttr("output_file", outputFile); | ||
return success(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This silently drops attributes if (mistakenly) multiple output dir annotations are attached to the module. Can we check and emit an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks fine.
I do realize that we made a mistake in the representation of the output directory structure and that this should be both: (1) in the FIRRTL spec and (2) sunk into the IR directly. This avoids the entirety of the OutputDirInfo
which is having to be created to track information that would be better represented in a symbol table.
When an output directory isn't explicitly declared, then its parent directory is | ||
implicitly the default output directory. To explicitly declare that a | ||
directory's parent is the default output directory, use an empty string as the | ||
parent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any problem that this allows the description of arbitrary directed, cyclic graphs as opposed to only trees?
I don't think there's a problem with acyclic graphs in terms of the LCA computation, though it is slightly more complicated as it needs to consider depth and there may exist more than one "LCA".
Consider the following precedence graph:
tl;dr: This should probably require that this is a tree.
In a follow-on, it may be good to formalize this into the FIRRTL spec using a similar structure like the layer
declarations to encode the explicit tree:
circuit Foo:
directory Foo "foo/":
directory Bar "bar/"
directory Baz "baz/"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Later language in the PR calls this a "tree". Nit: it would be good to be consistent throughout comments and docs that this is a tree if it must be a tree and not a graph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure I am interpreting this comment right
Is there any problem that this allows the description of arbitrary directed, cyclic graphs as opposed to only trees?
Regarding cycles, and multiple parents, there are checks in the AssignOutputDirs pass that ensure the precedence "graph/tree" is acyclic and that every directory has a single parent declaration.
Regarding multiple roots in the precedence graph/tree, that's not possible either, the graph/tree has a single root, the "default output directory".
Are you suggesting I rework the precedence annotations, so that it can only express a tree by construction? I am open to suggestions, but I chose this representation so that there wouldn't have to be a single monolithic annotation declaring all output directories in a single shot.
I will s/graph/tree/ in the PR for clarity.
auto nameField = anno.getMember<StringAttr>("name"); | ||
if (!nameField) | ||
return err() << "output directory declaration missing name"; | ||
if (nameField.empty()) | ||
return err() << "output directory name cannot be empty"; | ||
auto name = canonicalize(nameField); | ||
|
||
auto parentField = anno.getMember<StringAttr>("parent"); | ||
if (!parentField) | ||
return err() << "output directory declaration missing parent"; | ||
auto parent = canonicalize(parentField); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code, coupled with the behavior of the canonicalize
function are a bit weird. The checking of the inputs to canonicalize
are done here to produce errors. Yet, this is then checked again by canonicalize
only to return values which are not checked.
Also, the separation of canonicalize(StringRef)
from canonicalize(StringAttr)
isn't necessary given that only the latter is used in this file.
Several other better factorings exist:
- Move the error generation logic into
canonicalize
, check the return oncanoncalize
and just exit. - Move only the checking into
canonicalize
and keep the error messages where they are. This requires combining the error messages of!nameField
andnameField.empty()
. - Keep the current structure, but do not check in
canonicalize
. This may motivate makingcanonicalize
a lambda so that it is co-located.
Making canonicalize
a lambda may be better, too, given the single use and, for (1) or (2) the need to capture the circuit for the error message.
lib/Dialect/HW/HWAttributes.cpp
Outdated
StringAttr OutputFileAttr::getDirectoryAttr() { | ||
if (isDirectory()) | ||
return getFilename(); | ||
|
||
auto dir = getFilename().getValue(); | ||
for (unsigned i = 0, e = dir.size(); i < e; ++i) { | ||
if (dir.ends_with(llvm::sys::path::get_separator())) | ||
break; | ||
dir = dir.drop_back(); | ||
} | ||
|
||
if (dir.empty()) | ||
return nullptr; | ||
|
||
return StringAttr::get(getContext(), dir); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it is doing: llvm::sys::path::get_parent_path
?
@@ -0,0 +1,179 @@ | |||
; RUN: firtool --split-input-file %s | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this test again, this test is very verbose in order to get dedup to work. I am fine to have it be end-to-end. However, I would change it to not have so much logic in every module. Instead, the modules only need a don't touch'd wire (of the same size if you want them to dedup or a different size if not).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like we are adding lots of complexity and I'm not sure we really need it.
These seem obvious and would be a good stand-alone PR:
a) all relative paths should be relative to the output directory (-o flag)
b) Modules without outputs set go to the output directory
c) Layers without outputs set go to the output directory
d) Layers extracted to a module inherit the output directory of the layer.
e) Anything for which the user explicitly specifies a directory has that directory respected.
The key 2 question are:
a) whether modules are placed in directories based on the elaboration (instantiation) tree. If so, it seems the Least common ancestor is the natural place for that.
b) were public modules go. Do they move based on instantiation graph or always go in the output dir. What if they specify a location?
For a, saying LCA seems fine.
For b, either seems fine.
For both, I think anything which explicitly specifies a directory must have that respected.
What I think should be a non-goal is making directory nesting encode layer dependency despite what the user specifies for output directory of layers.
☝️ After discussing with @darthscsi more about this. This comment is saying that we can remove the Edit: We may be able to get the same behavior as we have today for |
This needs more discussion based on Lenharth feedback.
9a329db
to
ea8443c
Compare
auto &body = layer.getBody().getBlocks().front(); | ||
if (!body.empty()) { | ||
auto begin = body.op_begin<LayerOp>(); | ||
auto end = body.op_end<LayerOp>(); | ||
path.emplace_back(FlatSymbolRefAttr::get(layer.getSymNameAttr())); | ||
stack.emplace_back(begin, end); | ||
} | ||
|
||
while (!stack.empty() && idx(stack.back()) == end(stack.back())) { | ||
stack.pop_back(); | ||
path.pop_back(); | ||
} | ||
|
||
if (stack.empty()) | ||
break; | ||
|
||
assert(idx(stack.back()) != end(stack.back())); | ||
layer = *idx(stack.back()); | ||
++idx(stack.back()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like overkill. Can't this just lookup the layer by symbol and see if it has an output file attribute when it needs it? Alternatively, this could be cached, though isn't that just a walk over LayerOp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for the iteration on this, Rob. I think you've arrived at an excellent, minimal, and understandable design point. 💯
fs::make_absolute(outputDir, moduleOutputDir); | ||
path::remove_dots(moduleOutputDir, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks to be the correct way to canonicalize this. 👍 It's kind of surprising that there isn't an llvm::canonicalizePath
which doesn't try to use the discovered current directory as the actual current directory.
llvm::sys::fs::real_path
could be a one-stop-shop for this if we ever wanted tilde expansion and symlink resolution, too.
; CHECK-DAG: FILE "verification{{[/\]}}testbench{{[/\]}}layers_Testbench_A.sv" | ||
; CHECK-DAG: FILE "verification{{[/\]}}testbench{{[/\]}}Bar.sv" | ||
; CHECK-DAG: FILE "verification{{[/\]}}testbench{{[/\]}}DUT_A.sv" | ||
; CHECK-DAG: FILE "verification{{[/\]}}testbench{{[/\]}}Testbench.sv" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This nicely matches the spirit of the original test.
firrtl.module @InA() attributes {output_file = #hw.output_file<"A/foo">} { | ||
firrtl.instance ra @ByRA() | ||
firrtl.instance ab @ByAB() | ||
firrtl.instance a @ByA() | ||
firrtl.instance ac @ByAC() | ||
} | ||
|
||
firrtl.module @InB() attributes {output_file = #hw.output_file<"B/foo">} { | ||
firrtl.instance ab @ByAB() | ||
firrtl.instance bc @ByBC() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great edge case to test!
This helper gets the directory component of an output file name, or returns nullptr if there is none.
Instead of using an explicit precedence declaration anno to help guide the assignment of floating modules to output directories, use the directory hierarchy itself. So if a module is used under directory A/B and A/C, it will be placed into directory A.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pushing on this!!! Small feedback, generally LGTM! 🎉
Thanks everyone! |
Add output directory control for layers and firrtl.
For specifying the output directories:
In the Lower-Layers Pass:
Add a new pass, AssignOutputDirs, which will sink modules into the output directories they are instantiated from. This pass runs after lower-layers. In conjunction with the changes to the lower-layers pass, this means that modules which are only used under a particular layer will be sunk into that layer's output directory.