Skip to content

Conversation

@algoidurovic
Copy link
Contributor

Summary

Adding the match opcode, which is an extension of the switch opcode, where the later is a dense switch and the former a sparse switch. The match opcode is also able to match on bytes values, where switch only supports ints. The pushints and pushbytess were added to reduce the size of encoding match cases/constants as stack arguments. The alternative is to push the match cases onto the stack individually, taking up more space to store the additional opcodes. It's possible that this approach is better in some cases if the constants are stored in constant blocks so HLL compilers will need to be aware of this.

Test Plan

Assembly and evaluation tests.

@algoidurovic algoidurovic requested a review from jannotti October 14, 2022 16:44
Comment on lines 1129 to 1132
- construct a list of match cases from B_1 to B_N and branch to the Ith label where B_I = A. Continue to the following instruction if no matches are found.
- Availability: v8

`match` consumes N+1 values from the stack. Let the top stack value be A. The following N values represent an ordered list of match cases/constants (B), where the first value (B[0]) is the deepest in the stack. The immediate arguments are an ordered list of N labels (L). `match` will branch to the L[I], where B[I] = A. If there are no matches then execution continues on to the next instruction.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We call it B_1 to B_N above, then B[0] in description.

l = binary.PutUvarint(scratch[:], cu)
ops.pending.Write(scratch[:l])
if !ops.known.deadcode {
if !ops.known.deadcode { // this conditional doesn't seem necessary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It allows intcblock in deadcode to exist without mucking up the assemblers belief about what is in the cblock.

Comment on lines 5701 to 5704
int 0
int 1
int 0
match zero one
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do these test with big weird numbers like 37 and 62 or some such, so it's obvious at a glance that we're not talking about indices into the labels.

matchVal := cx.stack[top]
cx.stack = cx.stack[:top]

top = len(cx.stack) - n
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird to call this top. It's the base of the match values, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was thinking of top as representing the new top of the stack after the match cases are removed from the stack but I can see how that's imprecise/confusing.

@codecov
Copy link

codecov bot commented Oct 17, 2022

Codecov Report

Merging #4645 (97b008e) into master (db7d9c3) will increase coverage by 0.04%.
The diff coverage is 84.88%.

@@            Coverage Diff             @@
##           master    #4645      +/-   ##
==========================================
+ Coverage   54.33%   54.37%   +0.04%     
==========================================
  Files         402      402              
  Lines       51828    51859      +31     
==========================================
+ Hits        28159    28198      +39     
+ Misses      21297    21292       -5     
+ Partials     2372     2369       -3     
Impacted Files Coverage Δ
data/transactions/logic/doc.go 61.53% <ø> (ø)
data/transactions/logic/opcodes.go 84.56% <ø> (ø)
data/transactions/logic/assembler.go 86.37% <81.57%> (+0.97%) ⬆️
data/transactions/logic/eval.go 89.35% <87.23%> (-0.05%) ⬇️
data/transactions/logic/frames.go 100.00% <100.00%> (ø)
cmd/algoh/blockWatcher.go 77.77% <0.00%> (-3.18%) ⬇️
network/wsPeer.go 65.50% <0.00%> (-2.14%) ⬇️
data/transactions/verify/txn.go 76.19% <0.00%> (-0.96%) ⬇️
ledger/tracker.go 74.04% <0.00%> (-0.86%) ⬇️
... and 3 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@algoidurovic algoidurovic requested a review from jannotti October 17, 2022 17:26

## pushbytess bytes ...

- Opcode: 0x82 {varuint length} [({varuint value length} bytes), ...]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we change the first length to count?

For consistency, we would do this for the cblock opcodes and pushints too.


- Opcode: 0x8e {uint8 branch count} [{int16 branch offset, big-endian}, ...]
- Stack: ..., [N items] &rarr; ...
- construct a list of match cases from B[0] to B[N] and branch to the Ith label where B[I] = A. Continue to the following instruction if no matches are found.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

B[0] to B[N] implies N+1 targets.

I'd avoid talk of "construct a list"

The input stack doesn't seem right. It should have N items and a value, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, seems I've been sloppy with the docs.

Comment on lines 1903 to 1909
finalLen := len(cx.stack) + len(intc)
if cap(cx.stack) < finalLen {
// Let's grow all at once, plus a little slack.
newStack := make([]stackValue, len(cx.stack), finalLen+4)
copy(newStack, cx.stack)
cx.stack = newStack
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're doing this in three places now. Maybe we should have a function, grow or ensure?

Comment on lines 2177 to 2180
// first we must verify that the matchList has the expected length and all types match the type of matchVal
if len(matchList) != n {
return fmt.Errorf("list of match constants has length %d when length %d was expected", len(matchList), n)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is impossible, right? You've confirmed the stack is big enough, then sliced it up properly.

(Certainly the comment is out of date.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, the check and comment are both out of date.

{0x8c, "frame_bury", opFrameBury, proto("a:"), fpVersion, immKinded(immInt8, "i").typed(typeFrameBury)},
{0x8d, "switch", opSwitch, proto("i:"), 8, detSwitch()},
// 0x8e will likely be a switch on pairs of values/targets, called `match`
{0x8e, "match", opMatch, proto(":", "[N items]", ""), 8, detSwitch().trust()},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's where you need something like, "[N items], A".

Copy link
Contributor

@jannotti jannotti Oct 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, how about using:
"[T1 T2 T3 ... TN], B"
That way you have names for the targets (which I renamed T for target instead of B for branch to prevent confusion because we normally named the arguments A and B with A deeper in the stack).

That should make your documentation of the behavior easier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, they are not targets, they are match values. So I'd be ok with "[A1 A2 A3 ... An], B"

@algoidurovic algoidurovic requested a review from jannotti October 18, 2022 16:50
| `bytec_2` | constant 2 from bytecblock |
| `bytec_3` | constant 3 from bytecblock |
| `pushbytes bytes` | immediate BYTES |
| `pushbytess bytes ...` | push sequences of immediate bytes to stack |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this should also say "(first byte array being deepest)"

PS, we try to consistently say "byte array" rather than "bytes".

## pushbytes bytes

- Opcode: 0x80 {varuint length} {bytes}
- Opcode: 0x80 {varuint count} {bytes}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this one should be changed. It's the length of the byte array, not the count of how many byte arrays there are.

Comment on lines 1128 to 1129
- Stack: ..., [A1, A2, ..., AN], B &rarr; ...
- given match cases from A[0] to A[N-1], branch to the Ith label where A[I] = B. Continue to the following instruction if no matches are found.
Copy link
Contributor

@jannotti jannotti Oct 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still some disagreement on whether A is 0 based or 1 based. I vote for 1 based, only so the "Stack:" line can avoid using "N-1" as a "subscript".

continue
}

if matchVal.argType() == StackBytes && string(matchVal.Bytes) == string(stackArg.Bytes) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if matchVal.argType() == StackBytes && string(matchVal.Bytes) == string(stackArg.Bytes) {
if matchVal.argType() == StackBytes && bytes.Equal(matchVal.Bytes, stackArg.Bytes) {

- push sequences of immediate byte arrays to stack (first byte array being deepest)
- Availability: v8

pushbytess args are not added to the bytecblock during assembly processes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we need these caveats? I'd rather remove them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind removing it. I thought that it'd be useful to communicate in some way that these opcodes aren't necessarily more space efficient than a series of byte/int instructions. pushbytes/pushint have the same caveat in the docs.

matchedIdx := n
for i, stackArg := range matchList {
if stackArg.argType() != matchVal.argType() {
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not have any tests that mix bytes and ints?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more to the eval tests.

@algoidurovic algoidurovic requested a review from jannotti October 21, 2022 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants