
Add tx partitioner #1799

Merged: 7 commits merged into main on Nov 21, 2024
Conversation

aaronbuchwald (Collaborator):
This PR adds transaction partitioning based on the sponsor address into the DSMR package.

This will be used to assign transactions to a specific node for chunk production/verification and for targeted transaction gossip.
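The core idea, a deterministic mapping from a transaction's sponsor address onto a stake-weighted validator, can be sketched as follows. This is a minimal illustration, not the PR's code: the names `weightedNode` and `assign` are hypothetical stand-ins, and it assumes the sponsor address is at least 8 bytes long.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// weightedNode is a hypothetical stand-in for the PR's validator type.
type weightedNode struct {
	name   string
	weight uint64
}

// assign interprets the last 8 bytes of the sponsor address as a uint64,
// reduces it modulo the total weight, and walks the cumulative weight
// ranges until it finds the owning validator.
func assign(sponsor []byte, nodes []weightedNode) string {
	totalWeight := uint64(0)
	for _, n := range nodes {
		totalWeight += n.weight
	}
	target := binary.BigEndian.Uint64(sponsor[len(sponsor)-8:]) % totalWeight
	acc := uint64(0)
	for _, n := range nodes {
		acc += n.weight
		if target < acc {
			return n.name
		}
	}
	return "" // unreachable when totalWeight > 0
}

func main() {
	nodes := []weightedNode{{"a", 10}, {"b", 30}, {"c", 60}}
	sponsor := make([]byte, 20)
	sponsor[19] = 5 // target = 5 % 100 = 5, which falls in "a"'s range [0, 10)
	fmt.Println(assign(sponsor, nodes))
}
```

Because the mapping depends only on the sponsor bytes and the validator set, every node computes the same assignment independently.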

@aaronbuchwald aaronbuchwald marked this pull request as ready for review November 19, 2024 18:17
@aaronbuchwald aaronbuchwald self-assigned this Nov 19, 2024
Comment on lines 41 to 49
```go
weightedValidators := make([]*weightedValidator, 0, len(vdrs))
totalWeight := uint64(0)
for _, vdr := range vdrs {
	weightedValidators = append(weightedValidators, &weightedValidator{
		weight: vdr.Weight,
		nodeID: vdr.NodeID,
	})
	totalWeight += vdr.Weight
}
```
Contributor:
It would be marginally better to implement as:

```go
weightedValidators := make([]*weightedValidator, len(vdrs))
totalWeight := uint64(0)
for i, vdr := range vdrs {
	weightedValidators[i] = &weightedValidator{
		weight: vdr.Weight,
		nodeID: vdr.NodeID,
	}
	totalWeight += vdr.Weight
}
...
```

aaronbuchwald (Author):
vdrs is a map, so it would need to be:

```go
i := 0
for _, vdr := range vdrs {
	weightedValidators[i] = &weightedValidator{
		weight: vdr.Weight,
		nodeID: vdr.NodeID,
	}
	totalWeight += vdr.Weight
	i++
}
```

Going to leave this as is, since I'd prefer to skip the extra variable.

```go
}

func (pp *PrecalculatedPartition[T]) AssignTx(tx T) (ids.NodeID, bool) {
	sponsor := tx.GetSponsor()
```
Contributor:
Add:

```go
if pp.totalWeight == 0 || len(pp.validators) == 0 {
	return ids.NodeID{}, false
}
```
```go
	return binary.BigEndian.Uint64(sponsor[len(sponsor)-consts.Uint64Len:]) % totalWeight
}

func (pp *PrecalculatedPartition[T]) AssignTx(tx T) (ids.NodeID, bool) {
```
tsachiherman (Contributor), Nov 20, 2024:

The implementation of this function is not very efficient. Instead, PrecalculatedPartition should hold an array called accumulatedTotalWeights, where each element contains the total accumulated weight up to and including that validator. Then AssignTx can perform a binary search (O(log n)) instead of a linear scan (O(n)).
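The suggested layout can be sketched with `sort.Search` from the standard library. This is an illustration of the reviewer's idea, not the PR's final code; `assignByBinarySearch` is a hypothetical name, and the accumulated weights are assumed non-decreasing.

```go
package main

import (
	"fmt"
	"sort"
)

// assignByBinarySearch finds the index of the validator whose cumulative
// weight range contains target. accumulatedTotalWeights[i] holds the sum
// of all weights up to and including validator i, so sort.Search returns
// the smallest i for which target < accumulatedTotalWeights[i].
func assignByBinarySearch(target uint64, accumulatedTotalWeights []uint64) int {
	return sort.Search(len(accumulatedTotalWeights), func(i int) bool {
		return target < accumulatedTotalWeights[i]
	})
}

func main() {
	// Weights 10, 30, 60 accumulate to 10, 40, 100:
	// validator 0 owns [0, 10), validator 1 owns [10, 40), validator 2 owns [40, 100).
	acc := []uint64{10, 40, 100}
	fmt.Println(assignByBinarySearch(5, acc))  // 0
	fmt.Println(assignByBinarySearch(10, acc)) // 1
	fmt.Println(assignByBinarySearch(99, acc)) // 2
}
```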

aaronbuchwald (Author):
Great idea

Comment on lines 39 to 65
```go
func getWeightedVdrsFromState(ctx context.Context, state validators.State, pChainHeight uint64, subnetID ids.ID) ([]*weightedValidator, error) {
	vdrs, err := state.GetValidatorSet(ctx, pChainHeight, subnetID)
	if err != nil {
		return nil, err
	}
	weightedValidators := make([]*weightedValidator, 0, len(vdrs))
	for _, vdr := range vdrs {
		weightedValidators = append(weightedValidators, &weightedValidator{
			weight: vdr.Weight,
			nodeID: vdr.NodeID,
		})
	}
	return weightedValidators, nil
}

func precomputePartition[T Tx](validators []*weightedValidator) *PrecalculatedPartition[T] {
	utils.Sort(validators)
	accumulatedWeight := uint64(0)
	for _, weightedVdr := range validators {
		accumulatedWeight += weightedVdr.weight
		weightedVdr.accumulatedWeight = accumulatedWeight
	}
	return &PrecalculatedPartition[T]{
		validators:  validators,
		totalWeight: accumulatedWeight,
	}
}
```
Contributor:
I feel like we should just merge these two functions, or inline all of this code into CalculatePartition.

Comment on lines 67 to 74
```go
func CalculatePartition[T Tx](ctx context.Context, state validators.State, pChainHeight uint64, subnetID ids.ID) (*PrecalculatedPartition[T], error) {
	weightedValidators, err := getWeightedVdrsFromState(ctx, state, pChainHeight, subnetID)
	if err != nil {
		return nil, err
	}

	return precomputePartition[T](weightedValidators), nil
}
```
Contributor:
Another idea is to not accept the validators.State interface, which is a vm/consensus abstraction. We could just accept a slice []Validator, where Validator is a struct of a node ID + weight, so the signature becomes CalculatePartition(context.Context, []Validator). That makes the unit tests a bit nicer, since we won't have to use the awkward validators.State interface, which requires a lot of boilerplate setup. The tradeoff is that the VM would have to copy into the validator type that dsmr defines.

Another option would be to depend on map[ids.NodeID]*GetValidatorsOutput instead of this interface.

```go
}

func (pp *PrecalculatedPartition[T]) AssignTx(tx T) (ids.NodeID, bool) {
	if pp.totalWeight == 0 || len(pp.validators) == 0 {
```
Contributor:
Isn't pp.totalWeight == 0 the same as len(pp.validators) == 0? A validator can only have a positive, non-zero amount of stake, so it seems odd to check both cases.

aaronbuchwald (Author):
In reality, we should never hit either case; this is defensive. I don't feel too strongly and am happy to remove either or both.

aaronbuchwald (Author):
Removed both checks

Comment on lines 91 to 94
```go
// Defensive: this should never happen
if nodeIDIndex >= len(pp.validators) {
	return ids.NodeID{}, false
}
```
Contributor:
Should we just panic and break loudly, as opposed to quietly continuing to operate in a state we shouldn't be in? Leaving this check in may mask a problem in the future if this does end up breaking, since this case shouldn't be possible anyway.

Also, sort.Search only returns len(slice) when nothing is found, so we should be checking for == instead of >=.
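The `sort.Search` contract the reviewer is referring to can be shown directly: when no element satisfies the predicate, it returns exactly `len(slice)`, never more. The helper name here is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// searchAccumulated returns the index of the first accumulated weight that
// exceeds target. If target is at or beyond the last accumulated weight,
// sort.Search returns exactly len(acc), so a not-found check can use ==.
func searchAccumulated(acc []uint64, target uint64) int {
	return sort.Search(len(acc), func(i int) bool { return target < acc[i] })
}

func main() {
	acc := []uint64{10, 40, 100}
	fmt.Println(searchAccumulated(acc, 50))  // found: index 2
	fmt.Println(searchAccumulated(acc, 200)) // not found: returns len(acc), i.e. 3
}
```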

aaronbuchwald (Author):
Removed

```go
	return indexedTxs
}

func (pp *PrecalculatedPartition[T]) AssignTxs(txs []T) (map[ids.NodeID][]T, error) {
```
Contributor:
Why do we need both this and AssignTx? Can't we just call AssignTx in a loop if we wanted this behavior?

aaronbuchwald (Author):
This is faster, but could be premature optimization

```go
	return assignments, nil
}

func (pp *PrecalculatedPartition[T]) FilterTxs(nodeID ids.NodeID, txs []T) ([]T, error) {
```
Contributor:
Can't we get rid of this function as well? Filter could be implemented by calling AssignTx in a loop over txs and keeping only the transactions assigned to nodeID.
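The reviewer's point can be sketched with hypothetical stand-in types (`tx`, `assignTx`, and `filterTxs` are illustrations, not the PR's API): a filter is just the assignment loop plus a comparison.

```go
package main

import "fmt"

// tx is a hypothetical transaction whose target field stands in for the
// node that AssignTx would deterministically compute from the sponsor.
type tx struct {
	id     int
	target string
}

// assignTx stands in for PrecalculatedPartition.AssignTx.
func assignTx(t tx) (string, bool) { return t.target, t.target != "" }

// filterTxs keeps only the transactions assigned to nodeID, implemented
// as a plain loop over assignTx rather than a dedicated method.
func filterTxs(nodeID string, txs []tx) []tx {
	filtered := make([]tx, 0, len(txs))
	for _, t := range txs {
		if assigned, ok := assignTx(t); ok && assigned == nodeID {
			filtered = append(filtered, t)
		}
	}
	return filtered
}

func main() {
	txs := []tx{{1, "a"}, {2, "b"}, {3, "a"}}
	fmt.Println(len(filterTxs("a", txs))) // 2
}
```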

aaronbuchwald (Author):
Removed

```go
	return filteredTxs, nil
}

type Partition[T Tx] struct {
```
Contributor:
nit: Maybe we call this PartitionCache?
Q: I'm also not sure how this is going to be used so I'm wondering if this should live in dsmr or if this should live in the vm.

aaronbuchwald (Author):
This belongs in DSMR imo because transactions need to be partitioned to efficiently build chunks with non-overlapping transactions.

This is sort of part of "fortification," but even without malicious users/validators, we need to load balance which validators include transactions in their chunks to make this efficient.

```go
}

type PrecalculatedPartition[T Tx] struct {
	validators []*weightedValidator
```
Contributor:
nit: pointer seems like overkill for this since it's such a small struct... but I don't have performance data to say why it's right or wrong... so feel free to ignore this comment.

```go
	}
}

func CalculatePartition[T Tx](ctx context.Context, state validators.State, pChainHeight uint64, subnetID ids.ID) (*PrecalculatedPartition[T], error) {
```
Contributor:
nit: Personal preference but I generally dislike calling constructors anything other than New*.

```go
	return binary.BigEndian.Uint64(sponsor[len(sponsor)-consts.Uint64Len:]) % totalWeight
}

func (pp *PrecalculatedPartition[T]) AssignTx(tx T) (ids.NodeID, bool) {
```
Contributor:
nit: pp -> p (I just use the first letter of the type almost religiously)

```go
	accumulatedWeight uint64
}

type PrecalculatedPartition[T Tx] struct {
```
Contributor:
  1. I don't care where we put this struct definition, but I prefer defining a struct adjacent to wherever its corresponding functions live.
  2. Regarding naming, PrecalculatedPartition feels awkward to me because, as the caller, I don't care whether it's pre-calculated. Should we just call this something like Partitioner, since it's the thing that gives you a partition?

aaronbuchwald (Author):
Removing this for now to just define the partition function.

Comment on lines +155 to +160
```go
for i, vdr := range partition.validators {
	if vdr.nodeID == nodeID {
		foundNodeIDIndex = i
		break
	}
}
```
Contributor:
Can't we use sort.Search instead of this?
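A sketch of the suggestion, assuming the validators slice is kept sorted by nodeID (as the PR's utils.Sort call suggests): the linear scan becomes a binary search over the sorted IDs. The `nodeID` type and `findNodeIndex` helper are hypothetical stand-ins.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// nodeID is a small stand-in for ids.NodeID.
type nodeID [4]byte

// findNodeIndex binary-searches a slice of node IDs sorted in ascending
// byte order and reports the index of target, or false if absent.
func findNodeIndex(sorted []nodeID, target nodeID) (int, bool) {
	i := sort.Search(len(sorted), func(i int) bool {
		return bytes.Compare(sorted[i][:], target[:]) >= 0
	})
	if i < len(sorted) && sorted[i] == target {
		return i, true
	}
	return 0, false
}

func main() {
	ids := []nodeID{{1}, {2}, {3}}
	i, ok := findNodeIndex(ids, nodeID{2})
	fmt.Println(i, ok) // 1 true
}
```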

aaronbuchwald merged commit 7ab827d into main on Nov 21, 2024; 17 checks passed.
aaronbuchwald deleted the dsmr-partition branch on November 21, 2024 at 17:33.