Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardize chain.Action.Marshal function #1198

Merged
merged 95 commits into from
Aug 30, 2024

Conversation

containerman17
Copy link
Contributor

@containerman17 containerman17 commented Jul 25, 2024

As we aim to reduce the amount of code in VMs by following a "Sane Defaults" approach, I propose removing the need for explicit Marshal/Unmarshal functions in actions. Instead, we should use automatic marshalling by default, with an option for easy overrides.

Currently, manually marshalling 100,000 actions takes 40ms. Under this PR, it would take 94ms. An additional 54ms per 100k transactions seems acceptable.

  • Implement JavaScript counterpart of marshal/unmarshal
  • Allow overriding Marshal/Unmarshal
  • Integrate it into hypersdk
  • Explore a further speed up with go:generate
  • Check if Fuzz tests are working and required

@containerman17 containerman17 self-assigned this Jul 25, 2024
@containerman17
Copy link
Contributor Author

containerman17 commented Aug 1, 2024

Benchmark Results

TL;DR

It is double or even triple slower, but no one cares because marshal-unmarshal takes an insignificant portion of time at 100k TPS.

Transfer Action

The current manual approach takes 13ms per 100k cycles when single-threaded and 5ms when using 8 threads. The proposed reflection-based solution takes 44ms for single-threaded and 15ms for 8-threaded cycles. 15ms for 100k full cycles! Even on an 8-core machine (and potentially 16-32 cores), it is fast enough.

A Struct with 100 Inner Structs

For an exaggerated structure with an array of 100 inner structs, the 8-threaded result for the reflection-based approach takes 1067ms for 100k cycles, while the manual approach takes 535ms.

A Megabyte []byte Array

For an array with a million byte elements over 100k iterations, the reflection approach takes 13084ms on 8 cores, and the manual approach takes 14082ms. I commented out this test as there is no significant difference when comparing a single-field struct.

Raw results

// goos: linux
// goarch: amd64
// pkg: github.com/ava-labs/hypersdk/chain
// cpu: AMD EPYC 7763 64-Core Processor
// BenchmarkMarshalUnmarshal/Transfer-Reflection-1-8                     25          44908275 ns/op        35200248 B/op     500002 allocs/op
// BenchmarkMarshalUnmarshal/Transfer-Reflection-2-8                     39          28402783 ns/op        35200235 B/op     500003 allocs/op
// BenchmarkMarshalUnmarshal/Transfer-Reflection-4-8                     52          19412363 ns/op        35200397 B/op     500006 allocs/op
// BenchmarkMarshalUnmarshal/Transfer-Reflection-8-8                     67          15940310 ns/op        35201158 B/op     500012 allocs/op
// BenchmarkMarshalUnmarshal/Transfer-Manual-1-8                         79          13639643 ns/op        14400048 B/op     200002 allocs/op
// BenchmarkMarshalUnmarshal/Transfer-Manual-2-8                        138           8520965 ns/op        14400173 B/op     200003 allocs/op
// BenchmarkMarshalUnmarshal/Transfer-Manual-4-8                        190           6174020 ns/op        14400186 B/op     200005 allocs/op
// BenchmarkMarshalUnmarshal/Transfer-Manual-8-8                        230           5177846 ns/op        14400366 B/op     200009 allocs/op
// BenchmarkMarshalUnmarshal/Complex-Reflection-1-8                       8         131652394 ns/op        62400150 B/op    1200002 allocs/op
// BenchmarkMarshalUnmarshal/Complex-Reflection-2-8                      15          79716490 ns/op        62400238 B/op    1200003 allocs/op
// BenchmarkMarshalUnmarshal/Complex-Reflection-4-8                      22          53110410 ns/op        62400367 B/op    1200005 allocs/op
// BenchmarkMarshalUnmarshal/Complex-Reflection-8-8                      27          42143834 ns/op        62400900 B/op    1200010 allocs/op
// BenchmarkMarshalUnmarshal/Complex-Manual-1-8                          19          57420578 ns/op        40800108 B/op     900002 allocs/op
// BenchmarkMarshalUnmarshal/Complex-Manual-2-8                          33          33921873 ns/op        40800186 B/op     900003 allocs/op
// BenchmarkMarshalUnmarshal/Complex-Manual-4-8                          51          22917034 ns/op        40800419 B/op     900005 allocs/op
// BenchmarkMarshalUnmarshal/Complex-Manual-8-8                          60          18573280 ns/op        40800467 B/op     900009 allocs/op
// PASS
// ok      github.com/ava-labs/hypersdk/chain      20.763s

Transfer struct setup

type Transfer struct {
	To    codec.Address `json:"to"`
	Value uint64        `json:"value"`
	Memo  []byte        `json:"memo"`
}

transfer := Transfer{
	To:    codec.Address{1, 2, 3, 4, 5, 6, 7, 8, 9},
	Value: 12876198273671286,
	Memo:  []byte("Hello World"),
}

Complex struct setup

type InnerStruct struct {
	Field1 int32
	Field2 string
	Field3 bool
	Field5 []byte
}

type TestStruct struct {
	Uint64Field uint64
	StringField string
	BytesField  []byte
	IntField    int
	BoolField   bool
	Uint16Field uint16
	Int8Field   int8
	InnerField  []InnerStruct
}

test := TestStruct{
	Uint64Field: 42,
	StringField: "Hello, World!",
	BytesField:  []byte{1, 2, 3, 4, 5},
	IntField:    -100,
	BoolField:   true,
	Uint16Field: 65535,
	Int8Field:   -128,
	InnerField: func() []InnerStruct {
		inner := make([]InnerStruct, 100)
		for i := 0; i < 100; i++ {
			inner[i] = InnerStruct{
				Field1: int32(i),
				Field2: fmt.Sprintf("Inner string %d", i),
				Field3: i%2 == 0,
				Field5: []byte{byte(i), byte(i + 1), byte(i + 2), byte(i + 3), byte(i + 4)},
			}
		}
		return inner
	}(),
}

All of this is time per 100k cycles, not for a single unmarshal!

@containerman17
Copy link
Contributor Author

containerman17 commented Aug 1, 2024

func (p *Packer) PackInt(v int) {
	p.p.PackInt(uint32(v))
}

func (p *Packer) UnpackInt(required bool) int {
	v := p.p.UnpackInt()
	if required && v == 0 {
		p.addErr(fmt.Errorf("%w: Int field is not populated", ErrFieldNotPopulated))
	}
	return int(v)
}

I clearly restricted it to int32. As the previous implementation assumed that we are working on a 32-bit machine. Here is the new implementation:

// PackInt now accepts uint32 to ensure full support for 64-bit machines
func (p *Packer) PackInt(v uint32) {
	p.p.PackInt(v)
}

func (p *Packer) UnpackInt(required bool) uint32 {
	v := p.p.UnpackInt()
	if required && v == 0 {
		p.addErr(fmt.Errorf("%w: Int field is not populated", ErrFieldNotPopulated))
	}
	return v
}

Here is a test showing the bug:

func TestPackInt(t *testing.T) {
	require := require.New(t)
	wp := NewWriter(5, 5)
	wp.PackIntOld(math.MaxInt64)
	require.NoError(wp.Err(), "Error packing int.")

	rp := NewReader(wp.Bytes(), 5)
	unpackedVal := rp.UnpackIntOld(true)
	require.NoError(rp.Err(), "Error unpacking int.")

	require.EqualValues(math.MaxInt64, unpackedVal, "Unpacked value does not match packed value.")
}
// --- FAIL: TestPackInt (0.00s)
//     packer_test.go:134: 
//                 Error Trace:    /workspaces/hypersdk/codec/packer_test.go:134
//                 Error:          Not equal: 
//                                 expected: 9223372036854775807
//                                 actual  : 4294967295
//                 Test:           TestPackInt
//                 Messages:       Unpacked value does not match packed value.

@containerman17
Copy link
Contributor Author

I redid the PR today and minimized the number of changes.

containerman17 added a commit that referenced this pull request Aug 29, 2024
chain/dependencies.go Outdated Show resolved Hide resolved
codec/packer.go Outdated Show resolved Hide resolved
codec/packer.go Outdated Show resolved Hide resolved
codec/packer.go Outdated
Comment on lines 151 to 156
// Deprecated: Use PackBytes for better performance.
func (p *Packer) PackString(s string) {
p.p.PackStr(s)
}

// Deprecated: Use UnpackBytes for better performance.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why mark these as deprecated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unpacking strings is very slow, much slower than working with the same data in bytes. It should never be used in on-chain operations. However, it is used in the WebSocket server, where performance is not a critical concern.

chain/dependencies.go Outdated Show resolved Hide resolved
codec/type_parser.go Outdated Show resolved Hide resolved
examples/morpheusvm/registry/registry.go Outdated Show resolved Hide resolved
chain/actions.go Outdated Show resolved Hide resolved
chain/actions_test.go Outdated Show resolved Hide resolved
chain/actions_test.go Outdated Show resolved Hide resolved
chain/transaction_test.go Outdated Show resolved Hide resolved
chain/transaction_test.go Outdated Show resolved Hide resolved
@containerman17 containerman17 marked this pull request as ready for review August 30, 2024 02:41
aaronbuchwald
aaronbuchwald previously approved these changes Aug 30, 2024
Copy link
Collaborator

@aaronbuchwald aaronbuchwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with two nits that we should include as code quality improvements in the tests

@containerman17 containerman17 merged commit 7ce0081 into main Aug 30, 2024
15 checks passed
containerman17 added a commit that referenced this pull request Sep 12, 2024
* naive first marshalling attemmpt

* int 8-int64 supported

* negative numbers support

* support maps

* funky benchmark

* structs reflection caching attempt

* some fuzz tests

* fix speed comment

* relocate implementation out of test

* benchmark

* rename test to TestMakeSureMarshalUnmarshalIsNotTooSlow

* fix fuzz test

* spec tests for js implant

* update spec tests

* remove fuzz test

* simplify to 2 types

* restore original logic

* rewrite marshal with avalanchego's wrappers.Packer

* pack bytes with uint32 and everything else with uint16

* check for long arrays and strings, marshal maps with uint16

* update benchmark results

* move to codec

* come back to codec.packer

* speed up TestMakeSureMarshalUnmarshalIsNotTooSlow a bit

* support pointer to a struct

* auto marshaller integration

* lint

* remove a slow test breaking CI

* fix linter errors

* faster reflection cache

* minimize test to exclude testing errors

* unsafe type caching

* simplify benchmark

* update benchmarks

* add benchmark results

* add benchmem

* add benchmem results

* deprecate string operations

* move empty address error

* empty file

* lint

* remove .prof

* correct 'marshall' to 'marshal' according to Go conventions

* simplify codec.Packer

* get chainid from tmpnet instead of the platform (#1458)

* lint

* change to linearcodec

* lint

* go mod tidy

* add serialize tag to Transfer

* abi generation

* spec tests

* ABI in RPC

* HasTypeID as a separate iface

* add TypeParser.GetRegisteredTypes method

* move ABI to core API

* remove unused errors

* auto size calculation

* rename LinearCodecInstance

* lint

* update from #1198

* from clean slate

* return abi logic

* nit: remove function name in panics

* transaction test nit

* catch up with main

* lint

* rename HasTypeID to Typed

* separate package for abi

* restore spec tests

* treat codec.Address as a byte array while serializing

* comments and nits

* use set.Set instead of map[reflect.Type]bool

* lint

* remove memo field

* require serialize=true

* remove a breaker

* calculate ABI in place

* use ABI as struct in implementation

* stable ABI hash

* ABI wants its own ABI

* rename abi to vmabi

* remove ABI for ABI

* redo map as an array

* clean up test specs a bit

* basic codegen and refactor tests WIP

* trying different naming

* spec simplification WIP

* go generate

* lower case json

* proper codegen test

* further simplify spec tests

* transfer test

* file based spec test

* simplify tests

* ABI of ABI

* Outer/Inner struct tests for TS debug

* lint

* check ABI

* lost in merge

* remove comment lines in abi test

* remove a debug statement

* move mock abi file

* rename to abigen

* use cobra

* Update codec/address.go

Co-authored-by: aaronbuchwald <aaron.buchwald56@gmail.com>
Signed-off-by: containerman17 <8990432+containerman17@users.noreply.github.com>

* require that the " characters in address string

* rename ABI-related stuff

* fix tests after renaming

* test full marshal cycle

* TestDescribeVM

* Update abi/codegen.go

Co-authored-by: aaronbuchwald <aaron.buchwald56@gmail.com>
Signed-off-by: containerman17 <8990432+containerman17@users.noreply.github.com>

* remove StringAsBytes

* add unicode package

* lint

* go mod tidy

* flatten types def in abi

* don't use mixed receivers

* inline vm.Hash into a test

* rename abi.VM to abi.ABI

* remove embed

* funish renaming

* get rid of vm.getabi

* nit avoid redundant import alias

* DescribeVM -> NewABI

* lint

* mock gen

* share typesAlreadyProcessed across describing multiple actions

* put a comment on each test

* Update abi/auto_marshal_abi_spec_test.go

Co-authored-by: aaronbuchwald <aaron.buchwald56@gmail.com>
Signed-off-by: containerman17 <8990432+containerman17@users.noreply.github.com>

* nit: objectBytes

* remove mustPrintOrderedJSON

* comment on empty names

* comment on IsUpper

* revert typealias

* TODO here to switch to the new address format

* We should follow the style of funcName does X

* use rune in cobra

* comment on Dereference

* comment on serialize tag

* use %s and t instead of t.String()

* readme

* rename mockabi_test

* lint

---------

Signed-off-by: containerman17 <8990432+containerman17@users.noreply.github.com>
Co-authored-by: aaronbuchwald <aaron.buchwald56@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants