Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIRRTL] Output directory control for layers and modules #6971

Merged
merged 13 commits into from
Jun 14, 2024

Conversation

rwy7
Copy link
Contributor

@rwy7 rwy7 commented Apr 30, 2024

Add output directory control for layers and firrtl.

For specifying the output directories:

  • Add a new annotation called "circt.OutputDirAnnotation", for specifying the output directory of modules
  • For layers, add syntax in the firrtl surface language for optionally specifying the output directory. We can't use an annotation here because, annotations can't target layers.

In the Lower-Layers Pass:

  • Place layer collateral (the bindfile, any layerblocks->modules) under the output directory of the layer, if there is any.
  • Stop outputting layer collateral under the testbench/views

Add a new pass, AssignOutputDirs, which will sink modules into the output directories they are instantiated from. This pass runs after lower-layers. In conjunction with the changes to the lower-layers pass, this means that modules which are only used under a particular layer will be sunk into that layer's output directory.

@uenoku
Copy link
Member

uenoku commented May 1, 2024

What is an expected behavior when modules with different output dir annotations are deduplicated?

@rwy7 rwy7 marked this pull request as ready for review May 6, 2024 18:53
@rwy7
Copy link
Contributor Author

rwy7 commented May 6, 2024

What is an expected behavior when modules with different output dir annotations are deduplicated?

Hey, thanks for taking a look! Two modules will dedup iff they have the same output directory annotation (modulo dirname canonicalization when constructing the the hw::OutputFileAttr). I've added a test for this here.

Copy link
Member

@seldridge seldridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LCA algorithm comes across as more complicated than I think it should be. It may be easier to adopt a naive prefix LCA (as that is very understandable) or to go the direction of naive Euler Tour + RMQ (which should be recognizable and has room for performance improvements later if needed). This does appear to be doing a lot of LCA queries hence the RMQ formulation may make sense

Comments throughout.

docs/Dialects/FIRRTL/FIRRTLAnnotations.md Outdated Show resolved Hide resolved
docs/Dialects/FIRRTL/FIRRTLAnnotations.md Outdated Show resolved Hide resolved
docs/Dialects/FIRRTL/FIRRTLAnnotations.md Outdated Show resolved Hide resolved
include/circt/Dialect/FIRRTL/Passes.td Outdated Show resolved Hide resolved
include/circt/Dialect/FIRRTL/Passes.td Outdated Show resolved Hide resolved
lib/Dialect/FIRRTL/Transforms/LowerLayers.cpp Outdated Show resolved Hide resolved
test/Dialect/FIRRTL/annotations.mlir Show resolved Hide resolved
test/Dialect/FIRRTL/assign-output-dirs.mlir Outdated Show resolved Hide resolved
test/Dialect/FIRRTL/assign-output-dirs.mlir Outdated Show resolved Hide resolved
@@ -0,0 +1,179 @@
; RUN: firtool --split-input-file %s | FileCheck %s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test would be tighter as MLIR and running only on dedup. This allows for violations of initialization checking, i.e., the test can have modules that are basically empty. Alternatively, firtool Foo.fir -parse-only | circt-opt -firrtl-dedup gives you FIRRTL syntax for the test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea is this is testing that firtool -- whatever its pipeline is -- has the expected (or at least consistent, as it's under test) behavior from a user's experience regarding interaction of dedup + output_file (in terms exposed to user, not when/how/where we set what attributes internally).

For example, as-is, moving the output-dir-assignment pass before dedup -- while not changing dedup's behavior -- would cause test failures (that a test narrowed to dedup would not catch). Specifically regarding the last test in this file (and lower-layers.fir test too, apparently).

And is this indeed the intended behavior / semantics for "output dir" information end-to-end?

Would be good to pin down what is really expected/promised by using these knobs/annotations -- to inform the behavior of dedup or anything else that alters instance hierarchy in a way that changes what-goes-where (inlining, for example -- what if you inline a module with an output_file attribute on it (especially modules that it was the only parent of)?).

How'd y'all frame the intent here -- best-effort output directives, providing users ability to get their desired output even if not specifically promised in a particular way re:pipeline happenstance (but they could use more annotations / etc/ to change what happens if they don't like it?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this test again, this test is very verbose in order to get dedup to work. I am fine to have it be end-to-end. However, I would change it to not have so much logic in every module. Instead, the modules only need a don't touch'd wire (of the same size if you want them to dedup or a different size if not).

Comment on lines 418 to 421
if (auto moduleOp = dyn_cast<FModuleOp>(op)) {
moduleOp->setAttr("output_file", outputFile);
return success();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This silently drops attributes if (mistakenly) multiple output dir annotations are attached to the module. Can we check and emit an error?

seldridge
seldridge previously approved these changes Jun 7, 2024
Copy link
Member

@seldridge seldridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks fine.

I do realize that we made a mistake in the representation of the output directory structure and that this should be both: (1) in the FIRRTL spec and (2) sunk into the IR directly. This avoids the entirety of the OutputDirInfo which is having to be created to track information that would be better represented in a symbol table.

When an output directory isn't explicitly declared, then its parent directory is
implicitly the default output directory. To explicitly declare that a
directory's parent is the default output directory, use an empty string as the
parent.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any problem that this allows the description of arbitrary directed, cyclic graphs as opposed to only trees?

I don't think there's a problem with acyclic graphs in terms of the LCA computation, though it is slightly more complicated as it needs to consider depth and there may exist more than one "LCA".

Consider the following precedence graph:
image

tl;dr: This should probably require that this is a tree.

In a follow-on, it may be good to formalize this into the FIRRTL spec using a similar structure like the layer declarations to encode the explicit tree:

circuit Foo:
  directory Foo "foo/": 
    directory Bar "bar/"
  directory Baz "baz/" 

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Later language in the PR calls this a "tree". Nit: it would be good to be consistent throughout comments and docs that this is a tree if it must be a tree and not a graph.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure I am interpreting this comment right

Is there any problem that this allows the description of arbitrary directed, cyclic graphs as opposed to only trees?

Regarding cycles, and multiple parents, there are checks in the AssignOutputDirs pass that ensure the precedence "graph/tree" is acyclic and that every directory has a single parent declaration.

Regarding multiple roots in the precedence graph/tree, that's not possible either, the graph/tree has a single root, the "default output directory".

Are you suggesting I rework the precedence annotations, so that it can only express a tree by construction? I am open to suggestions, but I chose this representation so that there wouldn't have to be a single monolithic annotation declaring all output directories in a single shot.

I will s/graph/tree/ in the PR for clarity.

lib/Dialect/FIRRTL/Export/FIREmitter.cpp Show resolved Hide resolved
test/Dialect/FIRRTL/parse-basic.fir Show resolved Hide resolved
Comment on lines 101 to 111
auto nameField = anno.getMember<StringAttr>("name");
if (!nameField)
return err() << "output directory declaration missing name";
if (nameField.empty())
return err() << "output directory name cannot be empty";
auto name = canonicalize(nameField);

auto parentField = anno.getMember<StringAttr>("parent");
if (!parentField)
return err() << "output directory declaration missing parent";
auto parent = canonicalize(parentField);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code, coupled with the behavior of the canonicalize function are a bit weird. The checking of the inputs to canonicalize are done here to produce errors. Yet, this is then checked again by canonicalize only to return values which are not checked.

Also, the separation of canonicalize(StringRef) from canonicalize(StringAttr) isn't necessary given that only the latter is used in this file.

Several other better factorings exist:

  1. Move the error generation logic into canonicalize, check the return on canoncalize and just exit.
  2. Move only the checking into canonicalize and keep the error messages where they are. This requires combining the error messages of !nameField and nameField.empty().
  3. Keep the current structure, but do not check in canonicalize. This may motivate making canonicalize a lambda so that it is co-located.

Making canonicalize a lambda may be better, too, given the single use and, for (1) or (2) the need to capture the circuit for the error message.

lib/Dialect/FIRRTL/Transforms/AssignOutputDirs.cpp Outdated Show resolved Hide resolved
Comment on lines 133 to 141
StringAttr OutputFileAttr::getDirectoryAttr() {
if (isDirectory())
return getFilename();

auto dir = getFilename().getValue();
for (unsigned i = 0, e = dir.size(); i < e; ++i) {
if (dir.ends_with(llvm::sys::path::get_separator()))
break;
dir = dir.drop_back();
}

if (dir.empty())
return nullptr;

return StringAttr::get(getContext(), dir);
}
Copy link
Member

@seldridge seldridge Jun 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it is doing: llvm::sys::path::get_parent_path?

test/firtool/lower-layers.fir Outdated Show resolved Hide resolved
@@ -0,0 +1,179 @@
; RUN: firtool --split-input-file %s | FileCheck %s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this test again, this test is very verbose in order to get dedup to work. I am fine to have it be end-to-end. However, I would change it to not have so much logic in every module. Instead, the modules only need a don't touch'd wire (of the same size if you want them to dedup or a different size if not).

Copy link
Contributor

@darthscsi darthscsi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like we are adding lots of complexity and I'm not sure we really need it.
These seem obvious and would be a good stand-alone PR:
a) all relative paths should be relative to the output directory (-o flag)
b) Modules without outputs set go to the output directory
c) Layers without outputs set go to the output directory
d) Layers extracted to a module inherit the output directory of the layer.
e) Anything for which the user explicitly specifies a directory has that directory respected.

The key 2 question are:
a) whether modules are placed in directories based on the elaboration (instantiation) tree. If so, it seems the Least common ancestor is the natural place for that.
b) were public modules go. Do they move based on instantiation graph or always go in the output dir. What if they specify a location?

For a, saying LCA seems fine.
For b, either seems fine.
For both, I think anything which explicitly specifies a directory must have that respected.

What I think should be a non-goal is making directory nesting encode layer dependency despite what the user specifies for output directory of layers.

@seldridge
Copy link
Member

seldridge commented Jun 7, 2024

☝️ After discussing with @darthscsi more about this. This comment is saying that we can remove the OutputDirPrecedenceAnnotation and any logic associated with it and instead compute the LCA based on the LCA of the directories themselves. E.g., SimpleLCA("verilog/design/", "verilog/testbench/") == "verilog/". This cannot express certain things like we do internally where AnnoyingLCA("verilog/design/", "verilog/testbench/") == "verilog/design/"). However, we can change this internally so that we can use the SimpleLCA computation.

Edit: We may be able to get the same behavior as we have today for SimpleLCA by always choosing the provided output directory ("verilog/design/") if the LCA is above the provided output directory.

@seldridge seldridge dismissed their stale review June 7, 2024 20:35

This needs more discussion based on Lenharth feedback.

@rwy7 rwy7 force-pushed the output-control branch 2 times, most recently from 9a329db to ea8443c Compare June 11, 2024 18:51
docs/Dialects/FIRRTL/FIRRTLAnnotations.md Outdated Show resolved Hide resolved
lib/Dialect/FIRRTL/Transforms/AssignOutputDirs.cpp Outdated Show resolved Hide resolved
lib/Dialect/FIRRTL/Transforms/AssignOutputDirs.cpp Outdated Show resolved Hide resolved
lib/Dialect/FIRRTL/Transforms/AssignOutputDirs.cpp Outdated Show resolved Hide resolved
lib/Dialect/FIRRTL/Transforms/AssignOutputDirs.cpp Outdated Show resolved Hide resolved
Comment on lines 670 to 671
auto &body = layer.getBody().getBlocks().front();
if (!body.empty()) {
auto begin = body.op_begin<LayerOp>();
auto end = body.op_end<LayerOp>();
path.emplace_back(FlatSymbolRefAttr::get(layer.getSymNameAttr()));
stack.emplace_back(begin, end);
}

while (!stack.empty() && idx(stack.back()) == end(stack.back())) {
stack.pop_back();
path.pop_back();
}

if (stack.empty())
break;

assert(idx(stack.back()) != end(stack.back()));
layer = *idx(stack.back());
++idx(stack.back());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like overkill. Can't this just lookup the layer by symbol and see if it has an output file attribute when it needs it? Alternatively, this could be cached, though isn't that just a walk over LayerOp?

Copy link
Member

@seldridge seldridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks for the iteration on this, Rob. I think you've arrived at an excellent, minimal, and understandable design point. 💯

Comment on lines +34 to +35
fs::make_absolute(outputDir, moduleOutputDir);
path::remove_dots(moduleOutputDir, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks to be the correct way to canonicalize this. 👍 It's kind of surprising that there isn't an llvm::canonicalizePath which doesn't try to use the discovered current directory as the actual current directory.

llvm::sys::fs::real_path could be a one-stop-shop for this if we ever wanted tilde expansion and symlink resolution, too.

lib/Dialect/FIRRTL/Transforms/AssignOutputDirs.cpp Outdated Show resolved Hide resolved
lib/Dialect/FIRRTL/Transforms/AssignOutputDirs.cpp Outdated Show resolved Hide resolved
lib/Dialect/FIRRTL/Transforms/AssignOutputDirs.cpp Outdated Show resolved Hide resolved
lib/Dialect/FIRRTL/Transforms/LowerLayers.cpp Outdated Show resolved Hide resolved
lib/Dialect/FIRRTL/Transforms/LowerLayers.cpp Outdated Show resolved Hide resolved
lib/Dialect/FIRRTL/Transforms/LowerLayers.cpp Outdated Show resolved Hide resolved
test/Dialect/FIRRTL/annotations-errors.mlir Outdated Show resolved Hide resolved
; CHECK-DAG: FILE "verification{{[/\]}}testbench{{[/\]}}layers_Testbench_A.sv"
; CHECK-DAG: FILE "verification{{[/\]}}testbench{{[/\]}}Bar.sv"
; CHECK-DAG: FILE "verification{{[/\]}}testbench{{[/\]}}DUT_A.sv"
; CHECK-DAG: FILE "verification{{[/\]}}testbench{{[/\]}}Testbench.sv"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This nicely matches the spirit of the original test.

Comment on lines +49 to +59
firrtl.module @InA() attributes {output_file = #hw.output_file<"A/foo">} {
firrtl.instance ra @ByRA()
firrtl.instance ab @ByAB()
firrtl.instance a @ByA()
firrtl.instance ac @ByAC()
}

firrtl.module @InB() attributes {output_file = #hw.output_file<"B/foo">} {
firrtl.instance ab @ByAB()
firrtl.instance bc @ByBC()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great edge case to test!

rwy7 added 5 commits June 14, 2024 12:21
This helper gets the directory component of an output file name, or returns
nullptr if there is none.
Instead of using an explicit precedence declaration anno to help guide the
assignment of floating modules to output directories, use the directory
hierarchy itself.  So if a module is used under directory A/B and A/C, it will
be placed into directory A.
Copy link
Contributor

@dtzSiFive dtzSiFive left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pushing on this!!! Small feedback, generally LGTM! 🎉

lib/Dialect/FIRRTL/Transforms/AssignOutputDirs.cpp Outdated Show resolved Hide resolved
lib/Dialect/FIRRTL/Transforms/AssignOutputDirs.cpp Outdated Show resolved Hide resolved
@rwy7 rwy7 merged commit 41ebd04 into llvm:main Jun 14, 2024
4 checks passed
@rwy7 rwy7 deleted the output-control branch June 14, 2024 18:51
@rwy7
Copy link
Contributor Author

rwy7 commented Jun 14, 2024

Thanks everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants