[RFC][Sim] Add format string type and format specifier ops #7208

fzi-hielscher · 2024-06-19T16:31:14Z

The motivation of this PR is to get a frontend/backend agnostic representation of "printf-like" format strings into the core dialects. These should allow us to specify the emission of messages during simulation, which include formatted representations of runtime values.

It adds the !sim.fstring type to the IR, which represents a sequence of format tokens. These tokens can either be string literals or the combination of a (SSA) value and a format specifier. It also adds the usual suspects of format specifiers as individual operations: Binary, signed/unsigned decimal and hexadecimal. To make the transition of the current direct FIRRTL-to-SV lowering as simple (and compatible) as possible, the specifics of those heavily lean on the way formatting works in SystemVerilog. Currently formatting is limited to integer values. For the details I'd ask you to refer to the documentation in the TableGen source.
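As a rough illustration of the token model described above (not the PR's actual implementation), the following Python sketch treats a format string as a sequence of tokens, each either a literal or an integer value paired with a specifier. The names `Literal`, `Formatted` and `render` are hypothetical, and the sketch deliberately ignores the SystemVerilog-style padding rules that the TableGen documentation specifies.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Literal:
    """A string literal token (hypothetical name)."""
    text: str


@dataclass(frozen=True)
class Formatted:
    """An integer value paired with a format specifier (hypothetical name)."""
    value: int
    width: int  # bit width of the value
    spec: str   # one of "bin", "hex", "udec", "sdec"


Token = Union[Literal, Formatted]


def render(tokens: List[Token]) -> str:
    """Concatenate the textual form of each token, left to right."""
    out = []
    for tok in tokens:
        if isinstance(tok, Literal):
            out.append(tok.text)
            continue
        raw = tok.value & ((1 << tok.width) - 1)  # truncate to the bit width
        if tok.spec == "bin":
            out.append(format(raw, "b"))
        elif tok.spec == "hex":
            out.append(format(raw, "x"))
        elif tok.spec == "udec":
            out.append(str(raw))
        elif tok.spec == "sdec":
            # Interpret the truncated bits as two's complement.
            negative = tok.width > 0 and (raw >> (tok.width - 1)) == 1
            out.append(str(raw - (1 << tok.width) if negative else raw))
        else:
            raise ValueError(f"unknown specifier: {tok.spec}")
    return "".join(out)
```

For example, `render([Literal("Foo = "), Formatted(255, 8, "hex")])` yields `Foo = ff`, while the same value with the `"sdec"` specifier renders as `-1`.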

I deliberately decided against a representation where substitutions are embedded within a string, e.g., "Foo = %x". While this puts some burden on the frontend lowerings, which have to tokenize their format strings into separate operations, I think it will help us avoid confusion between different frontend and backend formats. And more importantly, it makes processing substitutions in an "MLIRish way" easier.
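To give an idea of the tokenization work this leaves to a frontend, here is a minimal Python sketch that splits a printf-like string such as "Foo = %x" into separate literal and substitution tokens. The `tokenize` function, the `SPECIFIERS` table and the `("lit", ...)` / `("sub", ...)` tuple encoding are all assumptions made for illustration; a real frontend would emit the corresponding operations instead of tuples and would follow its own format syntax.

```python
from typing import List, Tuple

# Hypothetical mapping from printf-style conversion characters to specifier names.
SPECIFIERS = {"b": "bin", "d": "dec", "x": "hex"}


def tokenize(fmt: str) -> List[Tuple[str, str]]:
    """Split a printf-like string into ("lit", text) and ("sub", spec) tokens."""
    tokens: List[Tuple[str, str]] = []
    literal: List[str] = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch != "%":
            literal.append(ch)
            i += 1
            continue
        if i + 1 >= len(fmt):
            raise ValueError("dangling '%' at end of format string")
        nxt = fmt[i + 1]
        if nxt == "%":
            # "%%" is an escaped percent sign and stays part of the literal.
            literal.append("%")
        elif nxt in SPECIFIERS:
            # Flush the pending literal before emitting the substitution.
            if literal:
                tokens.append(("lit", "".join(literal)))
                literal = []
            tokens.append(("sub", SPECIFIERS[nxt]))
        else:
            raise ValueError(f"unsupported specifier: %{nxt}")
        i += 2
    if literal:
        tokens.append(("lit", "".join(literal)))
    return tokens
```

With this sketch, `tokenize("Foo = %x")` returns `[("lit", "Foo = "), ("sub", "hex")]`. Once the string has been split like this, the frontend's own format syntax no longer appears anywhere downstream.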

These additions are intended as a first step towards a FIRRTL->Sim->Arc/LLVM-IR lowering pipeline for printf statements.
I've got a functional prototype of this integration over in my repository, which can hopefully give you an impression of what this could look like in the end.

I have to point out that this is effectively a competing implementation to the existing print operation in the Verif dialect, added in #5616. I hope this does not disqualify me from the title of friendly neighbourhood compiler engineer. I'm currently not aware of any (publicly available) lowering to FormatVerilogStringOp, so I cannot judge how hard it would be to transition this to my implementation. But I am somewhat confident that we'll find a way to unify these.

I'd be happy to hear your opinions and suggestions on this approach.

fabianschuiki · 2024-07-02T00:13:45Z

I love the idea of actually breaking the formatting string up into separate ops that then get concatenated. Especially since it makes the format string syntax itself a frontend language concern, which gets lowered to a bunch of ops, which can then get emitted appropriately by the backend. The FIRRTL printing ops basically accept "whatever SV format string you want". With this, we could actually specify formatting strings in FIRRTL that are parsed and lowered.

This feels a lot like what Rust did with its format!(...) macro, where the compiler actually splits the format string up into something like LLVM's Twine. Love it 😃

fzi-hielscher · 2024-07-02T12:56:48Z

Thanks a lot, @fabianschuiki.

I'm still hesitant to push (partially) redundant infrastructure without being able to sketch out a clear path to unification. Maybe @mortbopet or @teqdruid could briefly chime in on their use of the current verif print operations? Basically, how important is it to be able to pass format strings verbatim from the middle-end to the SV back-end? Can we reasonably limit ourselves to a subset of the Verilog format specifiers?

darthscsi · 2024-07-03T18:09:22Z

@prithayan is sim.printf used internally?

teqdruid · 2024-07-03T18:22:38Z

I don't think we use the verif dialect.

prithayan · 2024-07-03T20:17:29Z

> @prithayan is sim.printf used internally?

@darthscsi, we don't use the verif.print op internally.
The format string type looks good; it would also be useful for sim.plusargs.

fzi-hielscher · 2024-07-08T12:56:41Z

Thanks for the feedback. Since there are apparently no known users of the verif.print operation, I'd opt for the time-tested method of ripping it out and seeing if someone complains.

To be honest, using format strings for parsing (like sim.plusargs) was not on my radar. I'm not so optimistic that this design can be used for scanf-like operations. After all we cannot just reverse the dataflow of the formatting token operations. Then again, sim.plusargs looks to me like a very specific behavior that does not make much sense outside of a Verilog context. So, does it really make sense to try and abstract over this?

To make some progress here, I will merge this PR in the coming hours unless anyone has objections. Feel free to leave a post-merge review or voice your concerns in the inevitable follow-up PRs.

fabianschuiki · 2024-07-09T18:59:42Z

Really cool! 🥳

Interesting thought about scanf-like parsing of strings. This feels like you'd want a complementary set of parsing operations, and then either the frontend emits those directly, or maybe there's a pass that transposes the string concats into sequences of parses? Or treats them like a regex? Not entirely sure. The parsing/scanf is also harder because we don't have the liberty to assemble the string separately and pass it into a Verilog construct, like we have for printing. Although, the plusargs could theoretically accept any string and the parsing could then be done separately…
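One possible reading of the "treat them like a regex" idea, sketched in Python purely for illustration: a token sequence is translated into a regular expression, and the captured groups are converted back into integers. Nothing here is proposed in the PR; `PATTERNS`, `build_parser`, `parse` and the tuple encoding of tokens are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical table: regex fragment and numeric base for each specifier.
PATTERNS = {
    "bin": ("[01]+", 2),
    "dec": ("[0-9]+", 10),
    "hex": ("[0-9a-fA-F]+", 16),
}


def build_parser(tokens: List[Tuple[str, str]]) -> re.Pattern:
    """Turn ("lit", text) / ("sub", spec) tokens into a compiled regex."""
    parts = []
    for kind, payload in tokens:
        if kind == "lit":
            parts.append(re.escape(payload))  # literals must match verbatim
        elif kind == "sub":
            parts.append(f"({PATTERNS[payload][0]})")  # one capture per value
        else:
            raise ValueError(f"unknown token kind: {kind}")
    return re.compile("".join(parts))


def parse(tokens: List[Tuple[str, str]], text: str) -> Optional[List[int]]:
    """Return the parsed integers, or None if the text does not match."""
    match = build_parser(tokens).fullmatch(text)
    if match is None:
        return None
    specs = [payload for kind, payload in tokens if kind == "sub"]
    return [int(group, PATTERNS[spec][1])
            for group, spec in zip(match.groups(), specs)]
```

For instance, `parse([("lit", "Foo = "), ("sub", "hex")], "Foo = ff")` returns `[255]`. Two substitutions with no literal between them are ambiguous to match, which is one reason a dedicated set of parsing operations may be the cleaner option.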

fzi-hielscher force-pushed the sim-fmt-ops branch from db990ae to 36e76dc Compare June 21, 2024 12:27

fzi-hielscher marked this pull request as ready for review June 21, 2024 20:18

fzi-hielscher added 8 commits July 9, 2024 13:53

Formatting type and token ops.

8b62d2b

Concat op.

e3a1c45

Char fixes

50b8234

Tests

f420317

Error test

ffa1230

Spelling

4e72512

Remove pointless return

a401854

Concat canonicalizer: Only do a single pass

17c9285

fzi-hielscher force-pushed the sim-fmt-ops branch from 36e76dc to 17c9285 Compare July 9, 2024 11:57

fzi-hielscher merged commit 0899943 into llvm:main Jul 9, 2024
4 checks passed

fzi-hielscher mentioned this pull request Jul 9, 2024

[Sim] Add printing operations and transformation from non-procedural to procedural flavor #7292

Merged
