Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Support] Add FVInt, a four-valued arbitrary precision integer #7422

Merged
merged 1 commit into from
Aug 5, 2024

Conversation

fabianschuiki
Copy link
Contributor

@fabianschuiki fabianschuiki commented Aug 2, 2024

Add the FVInt class to CIRCT's support library. This class can represent arbitrary precision integers where each bit can be one of the four values 0, 1, X, and Z. The name intends to suggest a four-valued APInt, with the option to also introduce a nine-valued NVInt in the future.

Internally, FVInt uses two APInts to store its data: value stores whether a bit is 0/X or 1/Z, and unknown stores whether a bit is known (0 or 1) or unknown (X or Z). Together they allocate 2 bits of storage for each of the FVInt's digits, which allows for four different values per digit. This representation as value and unknown makes many of the logical and arithmetic operations pretty straightforward to implement. Most four-valued operations can be trivially implemented by performing the equivalent two-valued operation on value, and then accounting for X and Z bits through a few logic operations on unknown.

Note that Slang defines its own version of this (SVInt). But since Slang is an optional dependency of CIRCT, it makes sense to have a CIRCT equivalent that is built around LLVM's APInt, for use in our dialects.

This first version of FVInt has a rather incomplete set of operations, but it covers basic AND, OR, XOR, NOT, negation, addition, subtraction, and multiplication as a proof-of-concept. The remaining operations will be added in future commits. We are also going to need a four-valued equivalent of IntegerAttr based on FVInt.

This commit is motivated by the Slang frontend, which now supports enough of SystemVerilog to make some test designs start to hit the lack of number literals with X and Z.

Copy link
Contributor

@fzi-hielscher fzi-hielscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! 🎉
Just some nitpicks re zero width values.

When tinkering with nine-valued containers last year I spent a lot (arguably too much) time trying to compress the in-memory representation to avoid heap traffic. If, and only if, FVInt is intended as a general container for Verilog logic constants my intuition would still lead me to try and avoid allocating the unknown APInt for values that do not contain any X/Z. IIRC this is how slang's SVInt handles it. Of course it is only relevant for heap allocated APInts, i.e., values wider than 64 bit, so it may not actually matter in most cases.

I still have not entirely given up on my RLELogic class (#5793). I think it has some nice properties for dealing with parameterized-width IEEE 1164 values specifically. But for the slang integration your approach is a much better fit.

include/circt/Support/FVInt.h Outdated Show resolved Hide resolved
lib/Support/FVInt.cpp Outdated Show resolved Hide resolved
lib/Support/FVInt.cpp Outdated Show resolved Hide resolved
std::optional<FVInt> FVInt::tryFromString(StringRef str, unsigned radix) {
assert(radix == 2 || radix == 8 || radix == 10 || radix == 16);
if (str.empty())
return {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intuitively I would expect fvint == FVInt::fromString(fvint.toString(r), r) to be an invariant and also hold for fvint = FVInt(0, 0). But is the right representation "0" or the empty string? I had the same problem when writing the format string ops in the sim dialect. In the end I settled for "0" for decimal representation and the empty string otherwise. Not sure if this was a solomonic judgement or just confusing.
Maybe it would be best to require the expected width of the resulting FVInt to be passed as an argument, like APInt::fromString does?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point thanks, this was missing from the unit tests. Zero-width FVInts are printed as "0", since they have no unknown bits and therefore fall back to APInt::toString.

I don't think we need strict equality when round-tripping through the string representation, but we should definitely make sure that no bits are truncated or lost in doing so. I don't think we should print numbers as empty strings, though. The APInt parsing and printing already implicitly drops leading zeros when printing, and when printing decimals also creates strings that have technically more bits than the original binary representation.

FVInt::fromString is modeled after StringRef::consumeInteger, which is what ends up doing most of the integer parsing in the IR if I recall correctly. That function returns an APInt which is guaranteed to have enough bits to hold the parsed value, but it may have more. MLIR's attribute parser uses this in buildAttributeAPInt to get an APInt and then truncates it to the requested width, which also allows it to fail if the APInt is too wide. I think I'd prefer this API over having to guess a bit width upfront to pass to fromString, since it would be more cumbersome to distinguish between the parser failing because the integer was malformed, or because the bit width was insufficient.

@fabianschuiki
Copy link
Contributor Author

@fzi-hielscher Thanks a lot for the detailed review! 🥳

[...] my intuition would still lead me to try and avoid allocating the unknown APInt for values that do not contain any X/Z. IIRC this is how slang's SVInt handles it.

That is definitely a good point. My gut feeling here was to try and avoid early optimization, and instead try to get something up and running that's correct but possibly suboptimal. I think it's going to be easier to come back a while later and add these optimizations as a separate PR. And I think you're right, in practice this might be unmeasurable in almost all cases, since most integers will be very small.

Add the `FVInt` class to CIRCT's support library. This class can
represent arbitrary precision integers where each bit can be one of the
four values 0, 1, X, and Z. The name intends to suggest a *four-valued
APInt*, with the option to also introduce a *nine-valued NVInt* in the
future.

Internally, `FVInt` uses two `APInt`s to store its data: `value` stores
whether a bit is 0/X or 1/Z, and `unknown` stores whether a bit is known
(0 or 1) or unknown (X or Z). Together they allocate 2 bits of storage
for each of the `FVInt`'s digits, which allows for four different values
per digit. This representation as `value` and `unknown` makes many of
the logical and arithmetic operations pretty straightforward to
implement. Most four-valued operations can be trivially implemented by
performing the equivalent two-valued operation on `value`, and then
accounting for X and Z bits through a few logic operations on `unknown`.

Note that Slang defines its own version of this (`SVInt`). But since
Slang is an optional dependency of CIRCT, it makes sense to have a CIRCT
equivalent that is built around LLVM's `APInt`, for use in our dialects.

This first version of `FVInt` has a rather incomplete set of operations,
but it covers basic AND, OR, XOR, NOT, negation, addition, subtraction,
and multiplication as a proof-of-concept. The remaining operations will
be added in future commits. We are also going to need a four-valued
equivalent of `IntegerAttr` based on `FVInt`.

This commit is motivated by the Slang frontend, which now supports
enough of SystemVerilog to make some test designs start to hit the lack
of number literals with X and Z.
@fzi-hielscher
Copy link
Contributor

Rumor has it certain HDL generators produce ROMs as ginormous muxes of literals. At that point I feel literal performance could become a concern. On a completely unrelated note, one of my far too many experimental branches adds support for FirMemInitAttrs to arcilator. Maybe I should spend more time giving that some spit and polish rather than trying to prematurely micro-"optimize" heap allocations. 😬

@fabianschuiki
Copy link
Contributor Author

😬 That sounds like a pretty adventurous HDL generator 😛.

Your arcilator work sounds fantastic! Please hit me with all the PRs 😁

Copy link
Member

@uenoku uenoku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, this really looks great! Having data structure for 4 value logic does in CIRCT does make a lot sense :)

/// integer are unknown, the entire result is X.
FVInt &operator+=(const FVInt &other) {
value += other.value;
setAllXIfAnyUnknown(other);
Copy link
Member

@uenoku uenoku Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It might be a bit too conservative to mark everything unknown when lower bits are known, e.g. x10 + x10 = x00. Does IEEE 1800 define 4-value logic arithmetic as well as bitwise logical operation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I've found the following nugget in IEEE 1800-2017 "11.4.3 Arithmetic operators" hidden between tables:

For the arithmetic operators, if any operand bit value is the unknown value x or the high-impedance value z, then the entire result value shall be x.

So I guess conservative is fine/desired?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "fun" really starts when you get to the unary arithmetic operators. What's the expected output here?

module test;
  logic [3:0] foo = 4'b01XZ;
  initial begin
    $display("None: %b", foo);
    $display("Plus: %b", +foo);
    $display("Minus: %b", -foo);
  end
endmodule

I've seen simulators disagree on the handling of the unary plus.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nasty! I guess from that line in the standard you'd expect to see foo unchanged, and both +foo and -foo to be all X since they count as arithmetic operators? Unless there's a line in the standard that says unary + can be treated as a no-op… But if that doesn't exist, we'd need a moore.x_if_any_unknown style op 😅

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check out Table 11-5 right above the sentence you quoted.
cantbeliveididthis
I'll see myself out....

But most simulators seem to opt for "same as m".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a beautiful spec... But yeah, +m = m feels like the right thing to do intuitively, but it also seems to violate the spec. We can always drop an op that handles the X-ification for +.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I guess conservative is fine/desired?

Oh, yeah. Thank you for pointing it out!

@fabianschuiki fabianschuiki merged commit cbdee94 into main Aug 5, 2024
4 checks passed
@fabianschuiki fabianschuiki deleted the fschuiki/fvint branch August 5, 2024 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants